Convention, deinit doesn't invalidate #6322

marler8997 · 2020-09-11T18:52:20Z

This is a variation of #6299 that @andrewrk suggested be a separate proposal.

Proposal

This issue proposes that by convention, deinit never invalidates memory and always takes self as a value. So instead of this:

pub fn deinit(self: *@This()) void {
    // release resources
    self.* = undefined;
}

we use this:

pub fn deinit(self: @This()) void {
    // release resources
}

Furthermore, in the cases where invalidation is more than just self.* = undefined, it's proposed that a type can define a function named invalidate to contain type-specific invalidation logic, i.e.

pub fn invalidate(self: *@This()) void {
    self.file_open = false;
}

Why?

The reasoning for this convention appeals to the Single Responsibility Principle. Currently in the standard library there is a convention that deinit perform 2 operations:

release resources
invalidate memory

pub fn deinit(self: *@This()) void {
    self.foo.deinit(); // example of releasing resources
    // ...
    self.* = undefined; // invalidate memory
}

However, there are use cases where code may want to release resources, but not invalidate memory. Since deinit always does both, programs may not be able to use the struct in the way that makes sense in their situation. For example, this convention prevents programs calling deinit on const instances. Say we have a Page type that wraps a page address:

const Page = struct {
    ptr: *u8,
    pub fn deinit(self: *@This()) void {
        std.os.unmap(self.ptr);
        self.* = undefined;
    }
};

Because deinit invalidates memory, it must take a mutable reference. This prevents a program from being able to call deinit on a const version of Page:

const page = try Page.init();
defer page.deinit(); // compile error

For nested types, this can also cause the same memory to be invalidated multiple times when a type calls deinit on its members but also invalidate its own memory with self.* = undefined. Foo contains Bar, which contains Baz, which contains Buz, all of which call deinit on all their members and invalidate themselves, this would mean Foo would invalidate Buz 4 times.

You will also find that throughout the standard library, some types choose to invalidate memory inside deinit while other do not. These variations are likely caused by differing programmer sensibilities and different cost/benefit analyses for each case. The issue here is that the type definition itself doesn't have the full picture, only the user will. By establishing that types never force invalidation in deinit, this puts the "responsibility" for deciding whether to invalidate on the party with the all the information, the user. This is the same reasoning that makes Zig's allocators work so well, it puts the responsibility of deciding an allocation strategy on the user, not the functions that only need to allocate memory.

Note that some types in the standard library have already addressed these issues. For example HashMap defines the deallocate function, StreamServer defines the close function, which both release resources without invalidating. This proposal aims to help developers do the "right" thing by default, by allowing the user to make this decision insteading of forcing one on them. By establishing that deinit should always take self as a value type, it keeps invalidation and resource deallocation separate.

What about Invalidation?

In the common case that a user wants to release resources and invalidate memory, a common function can be implemented to do both, i.e.

pub fn destroy(o: var) void {
    o.deinit();
    if (@hasDecl(o, "invalidate")) {
        o.invalidate();
    } else {
        o.* = undefined;
    }
}

Usage:

var foo = Foo.init();
defer std.destroy(&foo);

Some may not like that this function is no longer a member function. If this is the case, we could also define it like this:

pub fn defineDestroy(comptime T: type) type {
    return struct {
        pub fn destroy(self: *T) void {
            self.deinit();
            if (@hasDecl(o, "invalidate")) {
                self.invalidate();
            } else {
                self.* = undefined;
            }
        }
    };
}


pub const Foo = struct {
    pub usingnamespace defineDestroy(@This());
};

Usage:

var foo = Foo.init();
defer foo.destroy();

The text was updated successfully, but these errors were encountered:

Tetralux · 2020-09-11T19:49:21Z

It's worth noting that init acquires those resources, and thus, you might expect deinit to release them.
I'd also consider destroy over deinitAndInvalidate.

marler8997 · 2020-09-12T00:21:01Z

@Tetralux I like the term destroy. I'm going to ammend the proposal to use that instead.

eriksik2 · 2020-09-13T17:48:23Z

should a const reference to a resource be able to release it?

marler8997 · 2020-09-14T16:01:08Z

should a const reference to a resource be able to release it?

This proposal suggests that all deinit functions (that release resources) take self by value, which is functionally the same as a const reference. The only real difference between the two is that a const reference forces the compiler to pass it by reference, whereas passing it by value allows the compiler to decide whether it should be passed by value or reference under the hood.

This follows the convention described here: ziglang/zig#6322.

andrewrk · 2021-11-23T20:35:08Z

I'm happy with deinit doing invalidation. There can be otherly named methods that do things such as reset or clear.

marler8997 · 2021-11-23T20:50:01Z

I'm happy with deinit doing invalidation.

There are already types that have a way to free resources without invalidation. For example, HashMap has deallocate which will release resources without invalidating. Why does HashMap support this case but you've decided not to support this for other types? If you're happy without this separation, are you happy with removing it from HashMap?

andrewrk · 2021-11-23T20:53:04Z

HashMap has

        pub fn deinit(self: *Self, allocator: *Allocator) void {
            self.deallocate(allocator);
            self.* = undefined;
        }

so it is doing the convention that deinit invalidates. I'm confused what you're trying to communicate.

marler8997 · 2021-11-23T20:55:44Z

The proposal is to establish a convention that allows you to "cleanup type resources" without invalidating them. HashMap has done this by adding a deallocate function. If everything had a deallocate function like HashMap then this proposal would be satisfied. Are you ok with all the types in std using this interface? If not, then why does HashMap provide a way to do this while others do not?

andrewrk · 2021-11-23T20:57:28Z

HashMap does not have a deallocate function. It is private.

Also the deallocate function does invalidate the memory, it just does not additionally have the safety of setting to undefined. If you tried to use the memory after deallocate, it would be just as broken as if the memory had been set to undefined, just slightly harder to debug.

marler8997 · 2021-11-23T21:02:57Z

HashMap does not have a deallocate function. It is private.

Oh shit I didn't realize that. Well then at least you're consistent, wrong in my opinion, but consistent :)

SpexGuy · 2021-11-23T21:29:23Z

From an api perspective, a freeing function must do one of two things:

free memory and leave the object in an unusable state (deinit).
free memory and leave the object in a reusable state (clear).

From a safety perspective, a freeing function of the first form should always leave pointers as undefined. There is no use for a freeing function without this safety on a public api, except as an optimization. SRP doesn't really apply.
As an optimization, this isn't a particularly good one. It saves repeated writing of a few words to cached memory when in debug mode, which is pretty cheap. If this is important, libraries can fix it pretty easily:

const MyType = struct {
    x: std.ArrayList(u32),
    y: std.AutoHashMap(u32, u32),
    z: u32,
    pub fn deinit(self: *@This()) void {
        self.x.deinit();
        self.y.deinit();
        self.z = undefined;
    }
};

Additionally, writing undefined communicates additional information that enables optimizations. So removing the assignment may be a pessimization.

The only remaining potential downside of this api is that it requires a pointer to mutable memory. However, this is not really a problem in practice.

Data structures which have a meaningful deinit() must be using some sort of runtime init(), so the underlying memory will not be a static constant.
Data structures which are intended to be used as constant data (like your Page example) do not need to implement this api this way.
In the extreme case, const can be cast away or the value can be copied into mutable memory.

In the specific cases we're looking at, like ArrayList and HashMap, an immutable one is useless because you can't modify the len field. So there's no need for the deinitializer to operate on const data.

marler8997 · 2021-11-23T21:33:41Z

Thanks for the very thorough analysis/explanation here @SpexGuy. I concede.

marler8997 · 2021-11-23T22:17:14Z

There's one more thing I thought of. As an example, I use "const file" alot. Currently std.fs.File doesn't invalidate (but is supposed to change) so I will have to use "var file" from now on, do you think that could have some downsides in optimization? I've had many similar cases that will no longer be able to use const if I start invalidating in all my of deinit functions. How sure are we that this is the right direction?

SpexGuy · 2021-11-23T22:28:39Z

This depends a lot on what const x = T.init(); actually means for structs. Because of RLS, x is definitely mutable for part of its lifetime. Whether x has immutable provenance is kind of up for debate at this point, especially once the result location becomes accessible through #2765. But even assuming it does, in the case of std.fs.file that really doesn't mean much. That provenance is probably not invalidated often, and if this is really important for a specific optimization you can copy the file into a const temporary, use that, and then deinit the original. So I think the safety wins out here in my mind. I'm not really worried that this is going to significantly inhibit the optimizer in meaningful ways.

That said, I think we can take this on a case by case basis. If std.fs.File is intended to be used as a const, then deinit shouldn't invalidate. Whether a type is meant to be stored in const is part of the api design of that type, so deinit can be adjusted accordingly.

To be clear, the decision here isn't "everything should invalidate", it's just "invalidation should be preferred, but decided on a case by case basis".

daurnimator · 2021-11-23T22:36:31Z

If std.fs.File is intended to be used as a const, then deinit shouldn't invalidate. Whether a type is meant to be stored in const is part of the api design of that type, so deinit can be adjusted accordingly.

To be clear, the decision here isn't "everything should invalidate", it's just "invalidation should be preferred, but decided on a case by case basis".

The problem I've found is that non-constness proliferates.
--> I'll add a new field to my struct (e.g. a AutoHashMap), and because it's .deinit() is non-const, now my .deinit() needs to be non-const; and users of my struct all need to convert to var.. which potentially has API issues for them.
I think this happens often enough that we should never invalidate.

Off the cuff idea: what if assigning undefined was allowed to const fields? It wouldn't actually write 0xaaaa, but it could still result in e.g. valgrind annotations that the memory area is undefined.

SpexGuy · 2021-11-23T22:44:13Z

The problem I've found is that non-constness proliferates.

Yeah, this phenomenon is sometimes uncharitably called "const poisoning", and it's one reason that extending #224 to pointers is a really bad idea. However, unlike languages like C++ or D where const is transitive through pointers/structures, in Zig it only applies to specific memory regions. So these changes can only proliferate through value inclusions, which limits their scope.

But also, the only real objection I'm seeing here is about backwards compatibility, which is not really a reason yet for std lib code. None of it is stable.

Vexu added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Sep 12, 2020

Vexu added this to the 0.8.0 milestone Sep 12, 2020

andrewrk modified the milestones: 0.8.0, 0.9.0 May 19, 2021

ibokuri pushed a commit to getty-zig/json that referenced this issue Oct 23, 2021

de: Add destroy method to Deserializer

0a61b7e

This follows the convention described here: ziglang/zig#6322.

andrewrk closed this as completed Nov 23, 2021

andrewrk added the standard library This issue involves writing Zig code for the standard library. label Nov 23, 2021

marler8997 mentioned this issue Dec 23, 2021

Constness inconsistencies in the standard library #9814

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convention, deinit doesn't invalidate #6322

Convention, deinit doesn't invalidate #6322

marler8997 commented Sep 11, 2020 •

edited

Loading

Tetralux commented Sep 11, 2020

marler8997 commented Sep 12, 2020

eriksik2 commented Sep 13, 2020

marler8997 commented Sep 14, 2020 •

edited

Loading

andrewrk commented Nov 23, 2021 •

edited

Loading

marler8997 commented Nov 23, 2021 •

edited

Loading

andrewrk commented Nov 23, 2021

marler8997 commented Nov 23, 2021 •

edited

Loading

andrewrk commented Nov 23, 2021

marler8997 commented Nov 23, 2021

SpexGuy commented Nov 23, 2021

marler8997 commented Nov 23, 2021

marler8997 commented Nov 23, 2021

SpexGuy commented Nov 23, 2021 •

edited

Loading

daurnimator commented Nov 23, 2021

SpexGuy commented Nov 23, 2021

Convention, deinit doesn't invalidate #6322

Convention, deinit doesn't invalidate #6322

Comments

marler8997 commented Sep 11, 2020 • edited Loading

Proposal

Why?

What about Invalidation?

Tetralux commented Sep 11, 2020

marler8997 commented Sep 12, 2020

eriksik2 commented Sep 13, 2020

marler8997 commented Sep 14, 2020 • edited Loading

andrewrk commented Nov 23, 2021 • edited Loading

marler8997 commented Nov 23, 2021 • edited Loading

andrewrk commented Nov 23, 2021

marler8997 commented Nov 23, 2021 • edited Loading

andrewrk commented Nov 23, 2021

marler8997 commented Nov 23, 2021

SpexGuy commented Nov 23, 2021

marler8997 commented Nov 23, 2021

marler8997 commented Nov 23, 2021

SpexGuy commented Nov 23, 2021 • edited Loading

daurnimator commented Nov 23, 2021

SpexGuy commented Nov 23, 2021

marler8997 commented Sep 11, 2020 •

edited

Loading

marler8997 commented Sep 14, 2020 •

edited

Loading

andrewrk commented Nov 23, 2021 •

edited

Loading

marler8997 commented Nov 23, 2021 •

edited

Loading

marler8997 commented Nov 23, 2021 •

edited

Loading

SpexGuy commented Nov 23, 2021 •

edited

Loading