Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove the destination type parameter from all cast builtins #5909

Closed
andrewrk opened this issue Jul 22, 2020 · 24 comments
Closed

remove the destination type parameter from all cast builtins #5909

andrewrk opened this issue Jul 22, 2020 · 24 comments
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@andrewrk
Copy link
Member

andrewrk commented Jul 22, 2020

@as is unchanged. These are affected:

  • @floatCast
  • @intCast
  • @alignCast
  • @bitCast
  • @errSetCast
  • @ptrCast
  • @truncate

The motivation here is to prevent bugs. Here's an example case:

const Data = struct {
    bits: u8,
};

fn updateBits(data: *Data) void {
    // ...
    const y: usize = indexOfSomethingGuaranteedToFitInBits();
    data.bits = @intCast(u8, y);
    // ...
}

Here, we obtain y, a usize but for whatever reason is guaranteed to fit into the bits field of Data. So we use @intCast here to cast it.

Now, we rework the code, and actually we need to add some more bits, making it a u16:

const Data = struct {
    bits: u16,
};

fn updateBits(data: *Data) void {
    // ...
    const y: usize = indexOfSomethingGuaranteedToFitInBits();
    data.bits = @intCast(u8, y);
    // ...
}

We've now introduced a bug into the code. The bits field is updated, but the @intCast is still casting too small. The lack of type inference here has actually made the code buggier.

The original code theoretically should have used data.bits = @intCast(@TypeOf(data.bits), y) however this is so cumbersome that it's rarely done in practice, especially when data.bits is a more complicated expression.

With this proposal, the original code would look like this:

const Data = struct {
    bits: u8,
};

fn updateBits(data: *Data) void {
    // ...
    const y: usize = indexOfSomethingGuaranteedToFitInBits();
    data.bits = @intCast(y);
    // ...
}

And the updated code (which only changes the type of the bits field to u16) remains correct.

With this proposal, it is still possible to get the older semantics:

const Data = struct {
    bits: u8,
};

fn updateBits(data: *Data) void {
    // ...
    const y: usize = indexOfSomethingGuaranteedToFitInBits();
    data.bits = @as(u8, @intCast(y));
    // ...
}

A cast operation with an inferred result location type would be a compile error:

test "example" {
    var y: i32 = 1;
    const x = @intCast(y); // error: unable to infer result type \n hint: coerce the result using `@as`
}
@andrewrk andrewrk added breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. labels Jul 22, 2020
@andrewrk andrewrk added this to the 0.7.0 milestone Jul 22, 2020
@zigazeljko
Copy link
Contributor

Why are @bitCast and @ptrCast also affected? The issue with casting too small does not occur there.

@pixelherodev
Copy link
Contributor

pixelherodev commented Jul 22, 2020 via email

@michal-z
Copy link
Contributor

michal-z commented Jul 22, 2020

Maybe we should allow this:

data.bits = @intCast(_, y);

In this case @intCast would try to infer the type. This would be consistent with array syntax ([_]).

@pixelherodev
Copy link
Contributor

pixelherodev commented Jul 22, 2020 via email

@andrewrk
Copy link
Member Author

Why are @bitCast and @ptrCast also affected? The issue with casting too small does not occur there.

Consistency across casting primitives. Note that callsites where this is inconvenient are very uncommon, because of this pattern:

const b: u32 = @bitCast(a);
const y: *u32 = @ptrCast(x);

// This is much clunkier, even if it's not a big deal for simpler cases.

@as isn't supposed to be clunky. Any clunkiness is a syntax flaw. Type coercion is supposed to be encouraged and one of the most syntactically simple things to do. For the purposes of this proposal, it should be assumed that @as is not clunky.

@Snektron
Copy link
Collaborator

Snektron commented Jul 22, 2020

One thing i don't like about both @andrewrk and @michal-z's proposals is that it adds functionality especially tailored for these built ins. I believe that the latter proposal is more promising, provided this functionality is extended to be valid for all generic functions. I do think though that it could generate problems with functions where multiple comptime parameters are provided, but that could be solved simply by stating that _ is equal to the expected type of the expression. Another option would be a new builtin which returns the expression's expected type, which on one hand would be more readable as it directly states whats happening, but on the other hand you would also need to type more (a thing which is also gained by removing the destination type).

I'm not sure whether i'm a fan of this type if inference, but in my opinion it beats @andrewrk's initial proposal as it is both more orthagonal (and doesn't provide a feature specific to these builtin calls), and it is clearer to the reader that the destination type is inferred.

Anyway, as a hypothetical situation, that functionality could again be extended to return types of functions, where the following is valid code:

fn add(a: _, b: _) _ {
  return a + b;
}

This would also allow for example calculation of constants for a certain type:

fn pi() _ {
  const T = @TypeOf(_); // TODO: Improve
  ...
  return calculated_pi_value;
}

As i type this i realize this is more in line with the original proposal though, so the latter paragraph may need some thinking.

@SpexGuy
Copy link
Contributor

SpexGuy commented Jul 22, 2020

I think the original proposal is a great idea. With this, type coercion is the only thing that changes types, and casts define how that change happens and which things are allowed or UB when performing the conversion. Type coercion tends to occur in only three places: moving values between structs and locals, passing values as parameters, and interacting with anytype. In the first two cases, the target type is known. In the third case, the syntax is unified: when assigning to anytype, use @as to change the type and @fooCast if you need to enable special behavior when performing that change.

There's a fourth case where you want to convert to some intermediate type and then coerce to a known target type, but if the target type is different from the cast type you are performing two coercions and IMO the syntax should make that clear.

@SpexGuy
Copy link
Contributor

SpexGuy commented Jul 22, 2020

I also think @ResultType() is a good idea, and should maybe be allowed as the return type of a function, as in @Snektron 's example. But I think that can be a separate proposal.

@andrewrk andrewrk added the accepted This proposal is planned. label Jul 22, 2020
@frmdstryr
Copy link
Contributor

Doesn't this fix one error but open up (or increase the likeliness) a whole set of other errors?

Eg, What happens above if the data field type is changed from u8 to i8?

The explicit type parameter is often a double check (at least for me) and makes it more clear what is going on without having to dig through more code. @michal-z 's syntax makes the most sense to me because it doesn't lose the explicit intent of the cast (where _ just means I don't care).

@andrewrk
Copy link
Member Author

andrewrk commented Jul 24, 2020

The explicit type parameter is often a double check (at least for me) and makes it more clear what is going on without having to dig through more code

You have @as for that, and you can use it to accomplish this anywhere. @as feels a bit more cumbersome than I would like it to, it's supposed to be a convenient & happy primitive to reach for any time.

In most cases you should be able to accomplish this by using the form const foo: T = @intCast(bar); and the coercion to T is what you wanted to document.

@ghost
Copy link

ghost commented Nov 25, 2020

@andrewrk This proposal could be made more ergonomic by borrowing C's coercion syntax: (T)val, where the parentheses are mandatory. So for instance (u16)@fromEnum(e) or (uptr)@bitCast(p). Coercion would bind tighter than binary operators but looser than unary. I think this is clear enough and won't lead to spaghetti code, and as far as I know it introduces no syntactic ambiguity (it cannot clash with function calls because they operate exclusively on different sides of an expression, nor with grouping because you cannot juxtapose groups). Coercion is a fairly fundamental language feature, I don't think we should have to go through a builtin for it.

@Rageoholic
Copy link
Contributor

Rageoholic commented Nov 25, 2020

So maybe I'm being stupid here but it seems like the problem is that implicit upcasting is masking the bug. It feels like maybe we could just remove the implicit upcasting? That way we keep the current cast syntax and we prevent this bug from being masked. It might be a little in a few places but it does let us just punt on the problem.

andrewrk added a commit that referenced this issue Apr 20, 2021
It is not yet time to implement #5909.
andrewrk added a commit that referenced this issue Apr 22, 2021
@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 May 19, 2021
@N00byEdge
Copy link
Sponsor Contributor

I sometimes have to turn a usize/u64 from a syscall into a i32/c_int for C interop. Today I do the following:

return @intCast(i32, @bitCast(i64, val));

Would this then look like the following in the future?

return @intCast(@as(i64, @bitCast(val)));

I'm having a bit of a hard time thinking about what this code will look like post-proposal.

@andrewrk
Copy link
Member Author

andrewrk commented Oct 2, 2021

Yes that's correct, assuming the return type of the function is i32

@N00byEdge
Copy link
Sponsor Contributor

N00byEdge commented Apr 15, 2022

I agree that the type of the resulting expression but the type parameter does make it clearer in assignments (not decls) what the cast is doing, even if it doesn't have much meaning.

The type is already given at the point of the decl. Writing foo = @as(@TypeOf(foo), bar) everywhere doesn't make the code clearer, and putting in explicit types is a violation of DRY and a bug begging to happen. But regardless, @as(T, ...) (or as(T, ...) per my suggestion) will still be available if you really want to do this.

Oh yeah, I definitely didn't mean that @as(@TypeOf would be more readable, but that @as(u64 would be. I do recognize that it repeats value that can be hard to maintain, but it is more readable, at least to me.

@N00byEdge
Copy link
Sponsor Contributor

This would mean you can't implement a cast with the same API in userspace anymore.

Another argument for @ResultType() (or perhaps it should be called @InferredResultType())

    pub inline fn intCast(x: anytype) @ResultType() {
        return @intCast(x);
}

There are a bunch more concepts like @ResultType() that only apply to inline function like @callerSrc() etc I would like to explore, but I'm not sure this should replace @as until there is a need to implement casts in userspace.

@jibal
Copy link

jibal commented Apr 15, 2022

Oh yeah, I definitely didn't mean that @as(@TypeOf would be more readable

I understand ... I wrote it that way to make a point. The most important fundamental rule in software development is DRY. Each policy decision that needs to be made should be made in as few places as possible, preferably one ... otherwise you invariably get mismatches as the code evolves.

but that @as(u64 would be. I do recognize that it repeats value that can be hard to maintain, but it is more readable, at least to me.

My claim is that this is never what you want. There are two reasons to limit the size of an int: be be compatible with something else, or for performance--memory or speed. That decision should be made in one place (Zig makes this easy with a type assignment, rather than special alias or typedef keywords). Adding explicit types throughout the code makes it less readable because it's not important or useful information. There's no need to know the specific size of an integer, and every cast gives reason to a code reader to wonder why it's there and why that size in particular is there. There's no need to know that it's a concrete value like u64, rather than whatever abstract type it represents--which should be done through well-chosen names. Zig makes it easy to pick the right sizes--I make liberal use of std.math.IntFittingRange ... and there's a proposal that I'm strongly in favor of to be able to specify lower and upper bounds on integer types rather than merely bitcounts--that lets you match the abstraction, rather than hardware details.

Anyway, I'm getting too much into philosophy--although it's quite relevant here because language design decisions about how much type inference to do depend on it.

On the other matter: there's no semantic difference between inline functions and non-inline functions and there should not be--it's strictly a performance issue. Neither @ResultType nor @callerSrc are specific to inline functions. Rather, @callerSrc should a take a level argument for how far back in the call stack you want the source information. @callerSrc without that isn't useful because the point in a package where you issue diagnostics can be arbitrarily deep. Actually, ideally, instead of a level there should be a way to specify the last caller that was outside of some module. Anyway, this is off-topic ... I had simply pointed out that @ResultType, which was mentioned previously, would solve the specific issue you raised about user functions not being able to get to the inferred result types the way these @ functions would (under the proposal). Oh, as for "replace @as" ... my note pointed out that it can already be replaced in userspace; it is, AFAICS, completely superfluous, and if so I would think that Andrew would want to get rid of it.

@andrewrk andrewrk modified the milestones: 0.10.0, 0.11.0 Apr 16, 2022
@andrewrk andrewrk modified the milestones: 0.11.0, 0.12.0 Jun 19, 2023
mlugg added a commit to mlugg/zig that referenced this issue Jun 22, 2023
mlugg added a commit to mlugg/zig that referenced this issue Jun 22, 2023
mlugg added a commit to mlugg/zig that referenced this issue Jun 23, 2023
mlugg added a commit to mlugg/zig that referenced this issue Jun 23, 2023
mlugg added a commit to mlugg/zig that referenced this issue Jun 23, 2023
mlugg added a commit to mlugg/zig that referenced this issue Jun 23, 2023
mlugg added a commit to mlugg/zig that referenced this issue Jun 24, 2023
mlugg added a commit to mlugg/zig that referenced this issue Jun 24, 2023
mlugg added a commit to mlugg/zig that referenced this issue Jun 24, 2023
@andrewrk andrewrk modified the milestones: 0.12.0, 0.11.0 Jun 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests