Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CString should have a "within max length" constructor #20475

Closed
shepmaster opened this issue Jan 3, 2015 · 9 comments
Closed

CString should have a "within max length" constructor #20475

shepmaster opened this issue Jan 3, 2015 · 9 comments
Labels
C-feature-accepted Category: A feature request that has been accepted pending implementation. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@shepmaster
Copy link
Member

This Stack Overflow question points out this relatively-common C idiom (trimmed down to show the interesting bits):

let mut errbuf = [0 as c_char, ..256];
c_function_that_fails(errbuf.as_mut_ptr() as *mut libc::c_char);
CString::new(errbuf.as_ptr(), false);

The questioner wonders about the case where errbuf is not correctly set and no NUL bytes occur in the buffer. When we go to use the string, we might merrily walk our way through memory! Since we know the length of errbuf, we should be able to check that there is a NUL in the first X bytes and fail otherwise.

While looking to see how Rust handles this case, I found the following spots that could suffer from the same problem:

I propose we add a new constructor to CString: of_max_len. This would make sure that a NUL byte occurs within a certain number of bytes. If it doesn't, we panic!.

I think we could also make this particular case easier with a constructor like from_slice, which could be passed the slice which would implicitly know its length.

I'm more than welcome to feedback of all kinds!

@alexcrichton
Copy link
Member

cc #20507, The purpose of CString is changing pretty radically in that redesign, but this still sounds relevant to the added c_str_to_bytes functions.

@aturon aturon added A-libs A-FFI Area: Foreign function interface (FFI) labels Jan 4, 2015
@chrish42
Copy link

chrish42 commented Jan 5, 2015

Actually, I guess what I was really asking was for that use case to be made easier. Ideally, I think something like this would be nice:

  1. a constructor that builds a CString, which contains a [c_char, ..MAX_LEN], initialized to \0;
  2. a function that can convert this to a *mut c_char, to pass to C code (which may write a C string into this buffer);
  3. once the C function has returned, a function that converts the array into a Rust string, panicking if there is no \0 within MAX_LEN. (Without needing to repeat said max length.)

Anyways, I think the use case of "allocate a buffer, pass it to a function, which may write a string into it" is not that uncommon for C FFI, and it's be nice if it was easy to do (and in a way that's not error-prone).

(Apologies if I'm getting some of the Rust terminology wrong here. I just started learning Rust about a week ago.)

@alexcrichton
Copy link
Member

Note that one idea I had on #20507 was that this use case could be handled with slice::from_raw_buf because the length is known so it can be converted to a slice immediately (and then a find for the 0 and slice afterwards).

@chrish42 for your use cases I would expect:

  1. This should in theory be handled with a small vector optimization which can store data inline in the CString itself (which now uses Vec internally).
  2. You can currently do that today with let mut buf = [0; LEN]; foo(buf.as_mut_ptr(), buf.len());, I don't think we need to involve CString with that.
  3. I think that this is the use case for this bug, which may be solvable with my idea above.

@jimrandomh
Copy link

This isn't just for error buffers, and this is very problematic even if the function is guaranteed to set the buffer and terminate it with a null. As part of my first Rust project, I'm trying to call ptsname_r to get a string, and pass the result to open(2). Hours later, I still haven't found a way to do this correctly, even awkwardly. The given example does not compile in current versions of Rust because CString::new doesn't take a second argument, and CString::new panics if there are any nulls in the Vec (including the trailing null). I could write my own strlen, slice the vec, and make a CString that way... but at that point it's clear the language isn't ready for this and my project would need a rewrite later.

@steveklabnik steveklabnik added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Nov 29, 2016
@Mark-Simulacrum Mark-Simulacrum added C-feature-request Category: A feature request, i.e: not implemented / a PR. and removed A-FFI Area: Foreign function interface (FFI) C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Sep 10, 2017
@dtolnay dtolnay added C-feature-accepted Category: A feature request that has been accepted pending implementation. and removed C-feature-request Category: A feature request, i.e: not implemented / a PR. labels Nov 19, 2017
@dtolnay
Copy link
Member

dtolnay commented Nov 19, 2017

I believe the new (as of 1.0.0-alpha 😛) CString API supports "within max length" use cases as follows.

use std::cmp;
use std::ffi::CString;
use std::mem;
use std::ptr;

unsafe fn c_function(buf: *mut u8, len: usize) {
    let data = b"hello\0";

    // buf may or may not end with \0 depending on whether there was room for it
    ptr::copy(data.as_ptr(), buf, cmp::min(len, data.len()));
}

fn call_the_thing() -> CString {
    unsafe {
        let mut buf = mem::uninitialized::<[u8; 256]>();
        c_function(buf.as_mut_ptr(), buf.len());
        let end = buf.iter().position(|ch| *ch == b'\0').unwrap_or(buf.len());
        CString::from_vec_unchecked(buf[..end].to_vec())
    }
}

fn main() {
    println!("{:?}", call_the_thing().as_bytes());
}

I would welcome a PR to support this better.

@dtolnay
Copy link
Member

dtolnay commented Dec 30, 2017

@Diggsey made an excellent crate to support this use case and others involving a fixed size buffer which may or may not contain a null terminator, so I think we are ready to close the issue.

let mut buf = mem::uninitialized::<[c_char; 256]>();
let ptr = buf.as_ptr();
let len = buf.len();

/* ... */

// copy the fixed length stack buffer which may or may not
// have null terminator into an owned CString
CFixedStr::from_ptr(ptr, len).to_owned().into_c_string()

@dtolnay dtolnay closed this as completed Dec 30, 2017
@Diggsey
Copy link
Contributor

Diggsey commented Dec 30, 2017

The crate is https://crates.io/crates/c_fixed_string

Bit early to say it's excellent just yet but thanks 😛

@dtolnay
Copy link
Member

dtolnay commented Dec 30, 2017

😆 I promise I meant to link to the crate.

@Diggsey
Copy link
Contributor

Diggsey commented Dec 30, 2017

Also, it would be better to write CFixedStr::from_ptr(ptr, len).to_c_str() as that will avoid copying data past the first null byte and will not perform an allocation unless required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-accepted Category: A feature request that has been accepted pending implementation. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants