Compiler: remove the special logic of union of pointer and nil #9872

asterite · 2020-11-01T11:54:39Z

Making Pointer(T) | Nil be represented as a pointer is unsound: when it's a null pointer it's not clear whether that's nil or Pointer(T). When interfacing with C this doesn't matter, but when not it can be problematic. And since pointers can be used for things that aren't just C bindings, I think we should change this.

It's a small breaking change, but nothing huge. Most people probably don't use nilable pointer types explicitly.

jhass · 2020-11-01T11:58:49Z

I think passing nil for a null pointer is extraordinarily common actually, so I wouldn't call this a "small" breaking change by any means.

jhass · 2020-11-01T12:04:01Z

Can we easily preserve this in just external function calls? That is I see the following as the main usecase:

lib LibFoo
  fun something(name : UInt8*)
end

LibFoo.something(nil)

asterite · 2020-11-01T12:05:53Z

That's a special rule when calling C functions: if a nil literal is passed, it works fine. So this is muy changed by this PR.

jhass · 2020-11-01T12:07:51Z

Ah, but

lib LibFoo
  fun something(name : UInt8*)
end

def something(name : String? = nil)
  LibFoo.something(name)
end

something

will break?

asterite · 2020-11-01T12:28:11Z

That still works, though I don't know why. I guess if the type passed is Nil then it's considered a null pointer.

It breaks if you have a union type:

lib LibFoo
  fun something(name : UInt8*)
end

name = rand < 0.5 ? nil : Pointer(UInt8).new(10)
LibFoo.something(name)

The above used to work but now it doesn't.

By the way, this also doesn't work:

lib LibFoo
  fun something(name : UInt8*)
end

name = rand < 0.5 ? nil : "hello"
LibFoo.something(name) # Error: argument 'name' of 'LibFoo#something' must be Pointer(UInt8), not (String | Nil)

I feel that we could also make that work, but we'll be adding more and more special rules when using Pointer(UInt8).null instead of nil is not that terrible, and a union type is not formed.

Then, passing an explicit nil literal is comparable to autocasting, which only works with explicit literals.

So, in general, I think this PR is good because:

it simplifies the compiler
it removes some special rules that people will have to learn
it makes the language more sound

asterite · 2020-11-01T12:31:45Z

Also, maybe you intended to pas a pointer to the C lib but for some reason you ended up with an unintended nil. Then you get a segfault or a wrong behavior and you don't know why. So this change could actually help spotting such bugs earlier.

In that sense, I think passing a Nil type to C functions (not just a nil literal) should be an error too, but I'll leave that for another PR if this is desired.

jhass · 2020-11-01T12:36:34Z

Well, where I'm coming from is jhass/crystal-gobject of course and Gtk and friends have tons and tons of APIs with tons and tons of optional parameters. And in C an optional parameter means pass me a pointer or a null pointer, because that's just 0 or NULL. So actually the binding generator has to generate a lot of code like

def something(a : String? = nil, b : String? =  nil)
  LibFoo.something(a ? a.to_unsafe : nil, b ? b.to_unsafe : nil)
end

and more complex things and I guess I was hoping to actually reduce that boilerplate a bit :D (The passing of the for example String?union doesn't always work in more complex calls for some reason already, so the binding generator is already doing the boilerplate ternary everywhere).

asterite · 2020-11-01T13:51:33Z

It's a good point. But letting nil pass just like that to a place that expects a pointer might be a source of bugs. I know we are a bit lax when it comes to interfacing with C, but maybe it's good if we make things a bit more strict.

I'll wait for input from others.

HertzDevil · 2020-11-01T14:27:42Z

I'm fine with this if it makes the Crystal side more robust and doesn't affect existing C passing semantics.

What are the arguments against simply converting nil to the null pointer for C bindings? I don't think any other C value corresponds to Crystal's nil; you can't use Nil inside a lib, and lib funs cannot take arguments of type Void either.

My idea is that nilable types can be passed to C bindings directly if nil corresponds to the null pointer, even for nilable pointers after this PR. (The notion of "the" null pointer is important; it should be implicitly convertible to any other pointer type. This is exactly C++'s std::nullptr_t.)

asterite · 2020-11-01T14:45:46Z

The question is: what if nil is accidentally passed to a function expecting a pointer? There's no way to detect that. And given we have types we should use them to provide type safety.

HertzDevil · 2020-11-01T15:02:29Z

I don't think C needs to detect the difference between pointers and nil if the latter is defined to always produce the null pointer in unsafe code. Not having the codegen mutate null nilable pointers to nil in Crystal code is good enough.

asterite · 2020-11-01T16:12:26Z

What I mean is, imagine a C functions expects a non null pointer. And somehow you end up passing nil instead of a pointer. That could happen if, for example, you forgot to assign a variable in an if branch. Then you'll have a bug and the compiler won't detect it. Does that make sense?

HertzDevil · 2020-11-01T16:30:58Z

What I assume is this:

if rand > 0.5
  x = Pointer(UInt8).malloc(16)
end
LibFoo.something(x)

But it's not like bugs like those are exclusive to nilable pointers:

class Foo
  @x = Pointer(UInt8).null

  def f
    if rand > 0.5
      @x = Pointer(UInt8).malloc(16)
    end
    LibFoo.something(@x)
  end
end

Foo.new.f

While I agree that type-level guarantees of non-null pointers are good, they don't exactly prevent Crystal nil from being treated as the null pointer under a C context.

(Such a non-null pointer type can probably be done in pure Crystal actually, and once again it doesn't have to interfere with the semantics of nil.)

asterite · 2020-11-01T16:36:21Z

I think your example of a bit different than what I said. You can always have bugs. But this change prevents more bugs than not having this change.

bcardiff · 2020-11-06T14:59:18Z

I like that it simplifies some compiler's rules and leaves a more strict and predictable behavior on Pointer(T) | Nil types.

But I'm worried about the burden on the C bindings things will push. Maybe a Pointer.unnil(ptr : Pointer(T)?) : Pointer(T) could be added to the std-lib as a convenient method (or some other nicer name of course).

Another thing we will be loosing (silently) when changing nil to Pointer(T).null is that the latter is truthy but the former falsey. Would it make sense to treat Pointer(T).null as falsey?

asterite · 2020-11-06T15:35:05Z

Would it make sense to treat Pointer(T).null as falsey?

It's already falsey. But I regret that "feature", I would only leave nil and false as falsey values, just like Ruby.

I don't understand why there would be a need for Pointer.unnil...? 🤔

HertzDevil · 2020-11-06T15:36:01Z

One difference between Pointer(T).null and nil is that the former yields itself on #try.

For passing nilable types to C (which is also common in e.g. the Qt5 bindings) I ended up writing my own converter. It looks roughly like

def to_lib(x : T) forall T
  {% if T == Nil %}
    Pointer(Void).null
  {% elsif T <= Int::Primitive || ... %} # all extern types
    x
  {% else %}
    case x
    when Nil
      typeof(to_lib(0.unsafe_as(T).not_nil!)).null
    when .responds_to?(:to_unsafe) # except Bool
      x.to_unsafe
    when Int::Primitive, ... # all extern types
      x
    else
      raise ArgumentError.new
    end
  {% end %}
end

It goes one step further than an unnil method, because it handles types like String? directly. For unnil one would still have to do Pointer.unnil(x.try(&.to_unsafe)).

asterite · 2020-11-06T15:38:59Z

If we don't care about nil being passed to C accidentally, then this is all about adding a rule to the compiler to call to_unsafe on the non-nil type, letting the nil part pass a pointer, and then adding a bit of logic to the codegen for this.

bcardiff · 2020-11-06T15:59:51Z

It's already falsey

My bad. Well, a benefit is that it will be a safer transition. Note: I don't even like nil to be a falsey value 🙊 ... but I think that discussion should come later and allow the compiler to warn about a potential change of semantics before doing it.

I don't understand why there would be a need for Pointer.unnil...? 🤔

To simplify the ternary operators for nil management here and there as @jhass pointed out.

struct Pointer
  def self.unnil(p)
    case p
    when Nil
      Pointer(T).null
    when Pointer
      p
    else
      p.to_unsafe
    end
  end
end

pp! Pointer(UInt8).unnil(nil)                      # => Pointer(UInt8).null
pp! Pointer(UInt8).unnil(Pointer(UInt8).null)      # => Pointer(UInt8).null
pp! Pointer(UInt8).unnil(Pointer(UInt8).malloc(1)) # => Pointer(UInt8)@0x108420e70
pp! Pointer(UInt8).unnil("hey")                    # => Pointer(UInt8)@0x107fc8a0c

asterite · 2020-11-06T16:05:46Z

Oh, I guess I would call it Pointer(T).to_unsafe... though it seems it applies to things other than pointers.

asterite · 2020-12-08T12:24:57Z

Coming back to this issue:

The existing semantic of this is broken and this PR fixes it
Code that would mix nil with Pointer(T), forming a union and passed to a C function would now not compile. You will need to pass Pointer(T).nullinstead ofnil. **However**, I think that's actually a good breaking change. You might accidentally mix Pointer(T)withnil` and not notice you are passing a null pointer. This makes it more explicit (explicit is always better than implicit)
Passing an explicit nil to a C function that expects a pointer still works
Passing T | Nil to a C function expecting a pointer, where you would expect to_unsafe to be called or null to be passed didn't work before this PR and doesn't work after this PR, so that discussion belongs to a separate issue

In summary:

this fixes a bug
this makes it harder to get nil be silently passed to a C function

straight-shoota

Sounds good to me 👍

Should we deprecate this first, though?

straight-shoota · 2020-12-08T13:07:03Z

src/openssl/ssl/hostname_validation.cr

-          addr6 = uninitialized LibC::In6Addr
-          if LibC.inet_pton(LibC::AF_INET6, hostname, pointerof(addr6).as(Void*)) > 0
-            return Result::MatchFound if addr6.unsafe_as(StaticArray(UInt32, 4)) == data.as(StaticArray(UInt32, 4)*).value
+    begin


Why this change? Is it related?

san_names was nil, without this change it doesn't compile anymore

asterite · 2020-12-08T13:11:28Z

How can this be deprecated? It's a language change.

straight-shoota · 2020-12-08T13:16:27Z

Instead of removing the logic entirely, the compiler could issue a warning when a NilablePointerType is used.

asterite · 2020-12-08T13:34:51Z

I see...

Could be, I don't know. I don't think this is such a large breaking change

bcardiff · 2021-01-18T19:36:03Z

I would like input from @jhass, if possible, regarding this issue before merging it or not for 0.36. Have you checked how much it affects the bindings you are used to?

jhass · 2021-01-18T19:39:49Z

I haven't checked for the moment, but if you're talking about crystal-gobject, it's a generator so whatever change shouldn't be too much effort.

stakach · 2021-01-28T08:00:13Z

this did require a few patches on our end but in doing so highlighted a bug where we were passing a nil-able object to a c-function expecting a string. So a worthwhile breaking change I think, thanks @asterite

asterite added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler breaking-change labels Nov 1, 2020

Compiler: remove the special logic of union of pointer and nil

c0a3a08

asterite force-pushed the bug/remove-nilable-pointer-rule branch from 72c866a to c0a3a08 Compare November 1, 2020 12:35

bcardiff added topic:compiler:semantic and removed topic:compiler labels Nov 6, 2020

Sija mentioned this pull request Nov 7, 2020

Run GHA CI only on push or pull request, but not both #9890

Closed

straight-shoota approved these changes Dec 8, 2020

View reviewed changes

straight-shoota added this to the 0.36.0 milestone Jan 17, 2021

bcardiff merged commit 0f91a5a into master Jan 19, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compiler: remove the special logic of union of pointer and nil #9872

Compiler: remove the special logic of union of pointer and nil #9872

asterite commented Nov 1, 2020

jhass commented Nov 1, 2020

jhass commented Nov 1, 2020

asterite commented Nov 1, 2020

jhass commented Nov 1, 2020

asterite commented Nov 1, 2020

asterite commented Nov 1, 2020

jhass commented Nov 1, 2020

asterite commented Nov 1, 2020

HertzDevil commented Nov 1, 2020

asterite commented Nov 1, 2020

HertzDevil commented Nov 1, 2020

asterite commented Nov 1, 2020

HertzDevil commented Nov 1, 2020 •

edited

Loading

asterite commented Nov 1, 2020

bcardiff commented Nov 6, 2020

asterite commented Nov 6, 2020

HertzDevil commented Nov 6, 2020 •

edited

Loading

asterite commented Nov 6, 2020

bcardiff commented Nov 6, 2020

asterite commented Nov 6, 2020

asterite commented Dec 8, 2020

straight-shoota left a comment

straight-shoota Dec 8, 2020

asterite Dec 8, 2020

asterite commented Dec 8, 2020

straight-shoota commented Dec 8, 2020

asterite commented Dec 8, 2020

bcardiff commented Jan 18, 2021

jhass commented Jan 18, 2021

stakach commented Jan 28, 2021

Compiler: remove the special logic of union of pointer and nil #9872

Compiler: remove the special logic of union of pointer and nil #9872

Conversation

asterite commented Nov 1, 2020

jhass commented Nov 1, 2020

jhass commented Nov 1, 2020

asterite commented Nov 1, 2020

jhass commented Nov 1, 2020

asterite commented Nov 1, 2020

asterite commented Nov 1, 2020

jhass commented Nov 1, 2020

asterite commented Nov 1, 2020

HertzDevil commented Nov 1, 2020

asterite commented Nov 1, 2020

HertzDevil commented Nov 1, 2020

asterite commented Nov 1, 2020

HertzDevil commented Nov 1, 2020 • edited Loading

asterite commented Nov 1, 2020

bcardiff commented Nov 6, 2020

asterite commented Nov 6, 2020

HertzDevil commented Nov 6, 2020 • edited Loading

asterite commented Nov 6, 2020

bcardiff commented Nov 6, 2020

asterite commented Nov 6, 2020

asterite commented Dec 8, 2020

straight-shoota left a comment

Choose a reason for hiding this comment

straight-shoota Dec 8, 2020

Choose a reason for hiding this comment

asterite Dec 8, 2020

Choose a reason for hiding this comment

asterite commented Dec 8, 2020

straight-shoota commented Dec 8, 2020

asterite commented Dec 8, 2020

bcardiff commented Jan 18, 2021

jhass commented Jan 18, 2021

stakach commented Jan 28, 2021

HertzDevil commented Nov 1, 2020 •

edited

Loading

HertzDevil commented Nov 6, 2020 •

edited

Loading