RFC: Consider turning as into a user-implementable Cast trait #7080

Closed
erickt opened this issue Jun 12, 2013 · 16 comments

Comments

@erickt
Contributor

erickt commented Jun 12, 2013

@cmr, @huonw and I were talking on IRC about how to name functions that convert from one type to another. For example, consider these str methods:

fn to_bytes(&str) -> ~[u8];
fn as_bytes<'a>(&'a str) -> &'a [u8];

These implement a common pattern, where in the to case, we are copying the string into the new vec. In the as case, we are making a no-copy cast from a string to a vector.
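As a rough illustration of the to/as distinction in present-day Rust syntax (the thread predates it, so `Vec<u8>` stands in for `~[u8]`, and the helper names `view_bytes` and `copy_bytes` are made up for this sketch):

```rust
/// `as_`-style: borrows the string's bytes; no allocation, no copy.
fn view_bytes(s: &str) -> &[u8] {
    s.as_bytes()
}

/// `to_`-style: allocates a new vector and copies the bytes into it.
fn copy_bytes(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}

fn main() {
    let s = String::from("hello");
    let borrowed = view_bytes(&s);
    let owned = copy_bytes(&s);

    // The borrowed view points at the string's own buffer.
    assert_eq!(borrowed.as_ptr(), s.as_ptr());
    // The owned copy lives in a separate allocation.
    assert_ne!(owned.as_ptr(), s.as_ptr());
    // Both hold the same bytes.
    assert_eq!(borrowed, owned.as_slice());
}
```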

It'd be nice if we could standardize this pattern, and one way we could do this is to turn the as operator into a trait a user can implement, like the Neg trait. We can do this if we follow the pattern @nikomatsakis laid out on his blog. Here's a working example implementation:

use std::io;

trait Cast<LHS> {
    fn cast(LHS) -> Self;
}

////

trait IntCast {
    fn cast_int(&self) -> int;
}

impl<LHS: IntCast> Cast<LHS> for int {
    fn cast(x: LHS) -> int { x.cast_int() }
}

impl IntCast for i8 {
    fn cast_int(&self) -> int { *self as int }
}

impl IntCast for i16 {
    fn cast_int(&self) -> int { *self as int }
}

////

trait StrCast {
    fn cast_str(&self) -> ~str;
}

impl<LHS: StrCast> Cast<LHS> for ~str {
    fn cast(x: LHS) -> ~str { x.cast_str() }
}

impl<'self> StrCast for &'self [u8] {
    fn cast_str(&self) -> ~str { self.to_str() }
}


////

trait StrSliceCast<'self> {
    fn cast_str_slice(&self) -> &'self str;
}

impl<'self, LHS: StrSliceCast<'self>> Cast<LHS> for &'self str {
    fn cast(x: LHS) -> &'self str { x.cast_str_slice() }
}

impl<'self> StrSliceCast<'self> for &'self [u8] {
    fn cast_str_slice(&self) -> &'self str {
        unsafe {
            assert!(std::str::is_utf8(*self));
            let (ptr, len): (*u8, uint) = std::cast::transmute(*self);
            std::cast::transmute((ptr, len + 1))
        }
    }
}

fn main() {
    io::println(fmt!("%?", Cast::cast::<i8, int>(5_i8)));
    io::println(fmt!("%?", Cast::cast::<i16, int>(5_i16)));

    io::println(fmt!("%?", Cast::cast::<&[u8], ~str>(bytes!("hello world"))));
    io::println(fmt!("%?", Cast::cast::<&[u8], &str>(bytes!("hello world"))));
}

While it's a bit wordy, it does work.
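For readers on a current compiler, the same double-dispatch shape can be approximated as follows. This is only a sketch under assumptions: `i64` and `String` stand in for `int` and `~str`, the type parameter is renamed `Src`, `StringCast` is a renamed `StrCast`, the unsafe borrowed-`&str` case is left out, and the byte-to-string conversion uses `String::from_utf8_lossy` rather than whatever the original `to_str()` produced.

```rust
/// Target-side trait: `Self` is the type being cast *to*.
trait Cast<Src> {
    fn cast(src: Src) -> Self;
}

/// Source-side helper: anything that knows how to become an `i64`.
trait IntCast {
    fn cast_int(&self) -> i64;
}

// One blanket impl per target type forwards to the source-side helper.
impl<Src: IntCast> Cast<Src> for i64 {
    fn cast(src: Src) -> i64 {
        src.cast_int()
    }
}

impl IntCast for i8 {
    fn cast_int(&self) -> i64 {
        i64::from(*self)
    }
}

impl IntCast for i16 {
    fn cast_int(&self) -> i64 {
        i64::from(*self)
    }
}

/// Source-side helper for owned strings (hypothetical name).
trait StringCast {
    fn cast_string(&self) -> String;
}

impl<Src: StringCast> Cast<Src> for String {
    fn cast(src: Src) -> String {
        src.cast_string()
    }
}

impl StringCast for &[u8] {
    fn cast_string(&self) -> String {
        // Assumption: invalid UTF-8 is replaced rather than rejected.
        String::from_utf8_lossy(self).into_owned()
    }
}

fn main() {
    // The target type can come from an annotation...
    let a: i64 = Cast::cast(5_i8);
    // ...or be spelled out with fully qualified syntax.
    let b = <i64 as Cast<i16>>::cast(5_i16);
    let s: String = Cast::cast(&b"hello world"[..]);
    println!("{} {} {}", a, b, s);
}
```

The two blanket impls do not overlap because they target different `Self` types, which is what lets each destination type pick its own helper trait.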

Unfortunately there's a third conversion option that I couldn't figure out how to fit into this paradigm, where we consume the input to produce the output:

fn to_option<T, U>(r: Result<T, U>) -> Option<T> {
    match r {
        Ok(x) => Some(x),
        Err(_) => None,
    }
}

Other places this would be useful are when we can cheaply transform ~str into ~[u8], or consume all the elements from a HashMap and move them into a ~[T]. Perhaps a separate function would be best to capture this case. Or we could do the reverse, and say Cast consumes the input, but since going from one reference type to another is cheap, we optimize that specific case. I'm not sure.
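A sketch of the three consuming conversions mentioned here, written against today's standard library (an assumption, since none of these APIs existed in this form when the issue was filed). The function names other than `to_option` are invented for the example.

```rust
use std::collections::HashMap;

/// Consumes the `Result`, keeping only the success value.
fn to_option<T, E>(r: Result<T, E>) -> Option<T> {
    r.ok()
}

/// Consumes the `String`; its buffer becomes the byte vector, so nothing is copied.
fn string_into_bytes(s: String) -> Vec<u8> {
    s.into_bytes()
}

/// Consumes the map, moving every entry into a vector (order is unspecified).
fn map_into_pairs<K, V>(m: HashMap<K, V>) -> Vec<(K, V)> {
    m.into_iter().collect()
}

fn main() {
    let ok: Result<i32, String> = Ok(3);
    assert_eq!(to_option(ok), Some(3));

    let bytes = string_into_bytes(String::from("hi"));
    assert_eq!(bytes, vec![b'h', b'i']);

    let mut m = HashMap::new();
    m.insert("one", 1);
    assert_eq!(map_into_pairs(m), vec![("one", 1)]);
}
```

In each case the input is moved into the function, so the caller can no longer use it afterwards; that is the property the borrowing `as` case and the copying `to` case both lack.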

@erickt
Contributor Author

erickt commented Jun 12, 2013

You can find a version that uses moves here: https://gist.github.com/erickt/5762933, but unfortunately I'm hitting this LLVM bug:

Assertion failed: (castIsValid(op, S, Ty) && "Invalid cast!"), function Create, file /Users/etryzelaar/Projects/rust/rust/src/llvm/lib/IR/Instructions.cpp, line 2290.

I expect it's related to #4759.

@Kimundi
Member

Kimundi commented Jun 12, 2013

Note that the reason for a built-in as is mainly casts in constexprs. It's also used for casting to a trait object. Replacing as with methods, or turning as into sugar for them, is a nice idea, but there needs to be a solution for something like this:

static FOO: char = 159u8 as char;
static BAR: u8 = '?' as u8 + 45;

However, seeing how

  1. explicit casting to trait objects might get removed and replaced with implicit coercion,
  2. as is otherwise only used for arithmetic casts, and
  3. trait-based casts are more versatile and user-extensible,

it might be sensible to replace the constexpr use case with a static_cast!() syntax extension:

static FOO: char = static_cast!(159u8 as char);
static BAR: u8 = static_cast!('?' as u8) + 45;

which would be hard-coded to expand to a literal of the right type.
In fact, such an extension would nicely subsume the current bytes!() syntax extension.
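To make the proposed call syntax concrete, here is a stand-in written with `macro_rules!` in present-day Rust. It is an approximation only: the proposal describes a compiler-level syntax extension that expands to a literal, whereas this hypothetical macro simply forwards to the built-in `as`, so it shows the surface syntax and nothing more.

```rust
// Hypothetical stand-in for the proposed `static_cast!()` extension.
// It matches `<token> as <type>` and re-emits the built-in cast.
macro_rules! static_cast {
    ($val:tt as $ty:ty) => {
        $val as $ty
    };
}

static FOO: char = static_cast!(159u8 as char);
static BAR: u8 = static_cast!('?' as u8) + 45;

fn main() {
    // 159 is 0x9F, so the resulting char is U+009F.
    assert_eq!(FOO, '\u{9f}');
    // '?' is 63, and 63 + 45 = 108.
    assert_eq!(BAR, 108);
}
```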

@emberian
Member

Note that I've been thinking about CTFE and how it can be used with static items. I think @bblum's effect proposal could easily express this: only functions/methods without certain effects are valid for consideration for CTFE. This would fit in nicely with #[static_assert] (for example, I want to assert that two &'static [&'static str] are the same size, but cannot because .len(), a method call, is not currently allowed there). Combine that with an as trait and perhaps it could be allowed in a static context? Although I'm not sure how well that'd work out in practice: sometimes you want the conversion to have effects, or need effects to actually do the cast. So maybe static_cast!() isn't a bad idea (or just special-casing the numeric casts).
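For comparison, the length check described above can be written on a recent Rust compiler using `const fn` and a constant-context `assert!`. This is not the `#[static_assert]` attribute or the effect proposal from the thread, just a sketch of the equivalent check; the tables `NAMES` and `LABELS` and the helper `same_len` are made up, and `const` items are used instead of `static` to keep the example portable across compiler versions.

```rust
// Two string tables that must stay the same length (illustrative data).
const NAMES: &[&str] = &["red", "green", "blue"];
const LABELS: &[&str] = &["R", "G", "B"];

/// A function the compiler may evaluate at compile time.
const fn same_len(a: &[&str], b: &[&str]) -> bool {
    a.len() == b.len()
}

// Evaluated during compilation; a mismatch becomes a build error.
const _: () = assert!(same_len(NAMES, LABELS));

fn main() {
    // The same function is also callable at run time.
    assert!(same_len(NAMES, LABELS));
    println!("{} names, {} labels", NAMES.len(), LABELS.len());
}
```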

I didn't consider how this would interact with trait objects and I don't know enough about them to evaluate its effect on them.

@erickt
Contributor Author

erickt commented Jun 12, 2013

@Kimundi: curse you, constant expressions! Bane of my existence. We could tone this down, and provide these traits to standardize how we transform from one type to another, and keep as around for the constant expressions.

Another option is to copy the bytes! syntax extension, and implement enough compile-time-typecasts to cover cases like you mentioned:

static FOO: char = char!(159u8);
static BAR: u8 = u8!('?') + 45;

Either way, it's not that common to use these static casts, at least in Rust proper. So I wouldn't be too upset if they were a little uglier to use, if that allowed us to take better advantage of the as operator, or even reclaim it for the user. Here are all the uses I could find:

libstd/managed.rs:    pub static RC_EXCHANGE_UNIQUE : uint = (-1) as uint;
libstd/managed.rs:    pub static RC_MANAGED_UNIQUE : uint = (-2) as uint;
libstd/num/int_macros.rs:pub static min_value: $T = (-1 as $T) << (bits - 1);
libstd/num/int_macros.rs:pub static max_value: $T = min_value - 1 as $T;
libstd/num/strconv.rs:static inf_buf:          [u8, ..3] = ['i' as u8, 'n' as u8, 'f' as u8];
libstd/num/strconv.rs:static positive_inf_buf: [u8, ..4] = ['+' as u8, 'i' as u8, 'n' as u8,
libstd/num/strconv.rs:static negative_inf_buf: [u8, ..4] = ['-' as u8, 'i' as u8, 'n' as u8,
libstd/num/strconv.rs:static nan_buf:          [u8, ..3] = ['N' as u8, 'a' as u8, 'N' as u8];
libstd/num/strconv.rs:priv static DIGIT_P_RADIX: uint = ('p' as uint) - ('a' as uint) + 11u;
libstd/num/strconv.rs:priv static DIGIT_I_RADIX: uint = ('i' as uint) - ('a' as uint) + 11u;
libstd/num/strconv.rs:priv static DIGIT_E_RADIX: uint = ('e' as uint) - ('a' as uint) + 11u;
libstd/num/uint_macros.rs:pub static min_value: $T = 0 as $T;
libstd/num/uint_macros.rs:pub static max_value: $T = 0 as $T - 1 as $T;
libstd/rand.rs:static scale : f64 = (u32::max_value as f64) + 1.0f64;
libstd/rand.rs:        static midpoint: uint = RAND_SIZE as uint / 2;
libextra/arena.rs:static tydesc_drop_glue_index: size_t = 3 as size_t;
libextra/num/bigint.rs:    priv static hi_mask: uint = (-1 as uint) << bits;
libextra/num/bigint.rs:    priv static lo_mask: uint = (-1 as uint) >> bits;
libsyntax/abi.rs:static IntelBits: u32 = (1 << (X86 as uint)) | (1 << (X86_64 as uint));
libsyntax/abi.rs:static ArmBits: u32 = (1 << (Arm as uint));
librustc/lib/llvm.rs:pub static True: Bool = 1 as Bool;
librustc/lib/llvm.rs:pub static False: Bool = 0 as Bool;
librustpkg/path_util.rs:pub static u_rwx: i32 = (S_IRUSR | S_IWUSR | S_IXUSR) as i32;
test/bench/shootout-fasta-redux.rs:static LOOKUP_SCALE: f32 = (LOOKUP_SIZE - 1) as f32;
test/bench/shootout-fasta-redux.rs:static NULL_AMINO_ACID: AminoAcid = AminoAcid { c: ' ' as u8, p: 0.0 };
test/bench/shootout-k-nucleotide.rs:static TABLE: [u8, ..4] = [ 'A' as u8, 'C' as u8, 'G' as u8, 'T' as u8 ];
test/compile-fail/const-cast-different-types.rs:static b: *u8 = a as *u8; //~ ERROR non-scalar cast
test/compile-fail/const-cast-different-types.rs:static c: *u8 = &a as *u8; //~ ERROR mismatched types
test/compile-fail/const-cast-wrong-type.rs:static a: [u8, ..3] = ['h' as u8, 'i' as u8, 0 as u8];
test/compile-fail/const-cast-wrong-type.rs:static b: *i8 = &a as *i8; //~ ERROR mismatched types
test/run-pass/const-autoderef.rs:static A: [u8, ..1] = ['h' as u8];
test/run-pass/const-cast-ptr-int.rs:static a: *u8 = 0 as *u8;
test/run-pass/const-cast.rs:static y: *libc::c_void = x as *libc::c_void;
test/run-pass/const-cast.rs:static b: *int = a as *int;
test/run-pass/const-enum-cast.rs:    static c1: int = A2 as int;
test/run-pass/const-enum-cast.rs:    static c2: int = B2 as int;
test/run-pass/const-enum-cast.rs:    static c3: float = A2 as float;
test/run-pass/const-enum-cast.rs:    static c4: float = B2 as float;
test/run-pass/const-str-ptr.rs:static a: [u8, ..3] = ['h' as u8, 'i' as u8, 0 as u8];
test/run-pass/const-str-ptr.rs:static b: *u8 = c as *u8;

@emberian
Member

@erickt I think with the effect system we could even have the trait be CTFE for the current static uses. It could be hardcoded in the interim, just to keep things pretty. @bblum?

@emberian
Member

Heh, I already left a comment to that end, never mind 😊

@bblum
Contributor

bblum commented Jun 17, 2013

A constexpr effect would be tractable, I think. Might be an issue with asserts -- if you wanted to write asserts in your function but have it be constexpr anyway, either you would have to pick one, or the compiler would be smart enough to turn the assert into a compile fail (which seems both possible and pretty leet, but difficult). The latter strikes me as a far-future feature.
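The "assert becomes a compile failure" behaviour described here can be seen in today's `const fn`, which is a different mechanism from the effect system under discussion. A small sketch, assuming a recent compiler and using a hypothetical `narrow` function:

```rust
/// Narrows a `u32` to `u8`, asserting that the value fits.
const fn narrow(x: u32) -> u8 {
    assert!(x <= u8::MAX as u32, "value does not fit in u8");
    x as u8
}

// Evaluated at compile time; the assert passes, so this builds.
const SMALL: u8 = narrow(200);

// Uncommenting the next line makes the build fail, because the assert
// panics while the constant is being evaluated:
// const TOO_BIG: u8 = narrow(300);

fn main() {
    assert_eq!(SMALL, 200);
    // The same function still works as an ordinary run-time call.
    assert_eq!(narrow(7), 7);
}
```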

Also keep in mind effects won't be in 1.0, so it would have to be backwards-compatible.

@emberian
Member

@erickt I think hardcoding in for the existing cases would be fine.

@emberian
Member

emberian commented Sep 5, 2013

Nominating for well-defined: this is a language feature.

@thestinger
Contributor

I'm against doing this because I don't think one type of conversion should be elevated above the others at a language level. For example, you can downcast to a smaller integer type by truncating, or you can clamp to the max value.
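To illustrate the point with present-day Rust (the function names are invented for this sketch), narrowing an integer has at least three reasonable meanings, and a single overloadable `as` could only pick one:

```rust
use std::convert::TryFrom;

/// Truncating conversion: keeps only the low 8 bits (what `as` does).
fn truncate(x: i32) -> u8 {
    x as u8
}

/// Clamping conversion: out-of-range values stick to the nearest bound.
fn saturate(x: i32) -> u8 {
    x.clamp(0, 255) as u8
}

/// Checked conversion: out-of-range values are reported as `None`.
fn checked(x: i32) -> Option<u8> {
    u8::try_from(x).ok()
}

fn main() {
    // 300 = 256 + 44, so truncation keeps 44.
    assert_eq!(truncate(300), 44);
    assert_eq!(saturate(300), 255);
    assert_eq!(checked(300), None);

    // In-range values agree across all three.
    assert_eq!(truncate(42), 42);
    assert_eq!(saturate(42), 42);
    assert_eq!(checked(42), Some(42));
}
```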

@emberian
Member

emberian commented Sep 5, 2013

That's a good argument. Having multiple types of conversions makes an overloadable as less straightforward and useful.

@thestinger
Contributor

I would be happiest if we lived in a world where Rust had CTFE, and we didn't need as in the language at all. A special case like that feels like a language wart to me.

@catamorphism
Contributor

Declining for milestone. We decided this is part of the larger story about constant evaluation, and there are already other bugs open on that.

@nikomatsakis
Contributor

My two cents: The reason we have as at all is to distinguish casts that can be done in constants. If we were going to use a trait, let's just make it a normal trait (or multiple, as strcat suggests). This would also solve the problem of the precedence of as. Therefore this really comes down to deciding on our constant strategy.

@emberian
Member

emberian commented Sep 5, 2013

@catamorphism which bugs?

@pnkfelix
Member

pnkfelix commented Sep 9, 2013

@cmr I think #5551 is the relevant metabug here.


8 participants