-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZendStr should accept strings containing null bytes. #200
Comments
In ZendStr::new() the References: |
ju1ius
changed the title
Uncaught Exception: String given contains NUL-bytes which cannot be present in a C string
ZendStr should accept string containing null bytes.
Nov 26, 2022
ju1ius
changed the title
ZendStr should accept string containing null bytes.
ZendStr should accept strings containing null bytes.
Nov 26, 2022
ju1ius
added a commit
to ju1ius/ext-php-rs
that referenced
this issue
Nov 26, 2022
ju1ius
added a commit
to ju1ius/ext-php-rs
that referenced
this issue
Nov 26, 2022
ju1ius
added a commit
to ju1ius/ext-php-rs
that referenced
this issue
Nov 26, 2022
ju1ius
added a commit
to ju1ius/ext-php-rs
that referenced
this issue
Nov 26, 2022
ju1ius
added a commit
to ju1ius/ext-php-rs
that referenced
this issue
Nov 27, 2022
ju1ius
added a commit
to ju1ius/ext-php-rs
that referenced
this issue
Nov 27, 2022
ju1ius
added a commit
to ju1ius/ext-php-rs
that referenced
this issue
Nov 28, 2022
ptondereau
pushed a commit
that referenced
this issue
Dec 9, 2022
Closes #200 ## Rationale In PHP zend_strings are binary strings with no encoding information. They can contain any byte at any position. The current implementation use `CString` to transfer zend_strings between Rust and PHP, which prevents zend_strings containing null-bytes to roundtrip through the ffi layer. Moreover, `ZendStr::new()` accepts only a `&str`, which is incorrect since a zend_string is not required to be valid UTF8. When reading the PHP source code, it is apparent that most functions marked with `ZEND_API` that accept a `const *char` are convenience wrappers that convert the `const *char` to a zend_string and delegate to another function. For example [zend_throw_exception()](https://github.com/php/php-src/blob/eb83e0206c9b9261d786943bf2c5ad61dca287e2/Zend/zend_exceptions.c#L823) takes a `const *char message`, and just converts it to a zend_string before delegating to [zend_throw_exception_zstr()](https://github.com/php/php-src/blob/eb83e0206c9b9261d786943bf2c5ad61dca287e2/Zend/zend_exceptions.c#L795). I kept this PR focused around `ZendStr` and it's usages in the library, but it should be seen as the first step of a more global effort to remove usages of `CString` everywhere possible. Also, I didn't change the return type of the string related methods of `Zval` (e.g. I could have made `Zval::set_string()` accept an `impl AsRef<[u8]>` instead of `&str` and return `()` instead of `Result<()>`). If I get feedback that it should be done in this PR, I'll do it. ## Summary of the changes: ### ZendStr * [BC break]: `ZendStr::new()` and `ZendStr::new_interned()` now accept an `impl AsRef<[u8]>` instead of just `&str`, and are therefore infaillible (outside of the cases where we panic, e.g. when allocation fails). This is a BC break, but it's impact shouldn't be huge (users will most likely just have to remove a bunch of `?` or add a few `Ok()`). * [BC break]: Conversely, `ZendStr::as_c_str()` now returns a `Result<&CStr>` since it can fail on strings containing null bytes. * [BC break]: `ZensStr::as_str()` now returns a `Result<&str>` instead of an `Option<&str>` since we have to return an error in case of invalid UTF8. * adds method `ZendStr::as_bytes()` to return the underlying byte slice. * adds convenience methods `ZendStr::as_ptr()` and `ZendStr::as_mut_ptr()` to return raw pointers to the zend_string. ### ZendStr conversion traits * adds `impl AsRef<[u8]> for ZendStr` * [BC break]: replaces `impl TryFrom<String> for ZBox<ZendStr>` by `impl From<String> for ZBox<ZendStr>`. * [BC break]: replaces `impl TryFrom<&str> for ZBox<ZendStr>` by `impl From<&str> for ZBox<ZendStr>`. * [BC break]: replaces `impl From<&ZendStr> for &CStr` by `impl TryFrom<&ZendStr> for &CStr`. ### Error * adds new enum member `Error::InvalidUtf8` used when converting a `ZendStr` to `String` or `&str`
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi,
PHP strings are allowed to contain null bytes, but the current implementation of ZendStr doesn't allow it.
Expected result:
Actual result:
The text was updated successfully, but these errors were encountered: