-
-
Notifications
You must be signed in to change notification settings - Fork 701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 17912: Add function for creating a temporary file. #5788
base: master
Are you sure you want to change the base?
Conversation
e109807
to
fd9a436
Compare
Great. By some miracle, Windows went perfectly on the first go, but then Mac OS X failed for some reason. :| |
It passed Mac OS X this time though, so I have no clue what happened. In theory, it should act the same on Mac OS X as the other POSIX platforms. |
std/file.d
Outdated
return filename; | ||
} | ||
|
||
throw new FileException("Failed to create a temporary file"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Throws
ddoc section should be added for this exception.
Why not Why not |
Because the last time I looked into it, there were some POSIX systems that could only generate something like 26 file names with mkstemp. I don't think that the ones dmd supports at the moment have that problem, but others did, which then potentially becomes a problem for LDC or GDC, and I see no advantage to using mkstemp over just doing it ourselves when some systems do a terrible job with mkstemp. The code used here was in Phobos previously with
It doesn't support suffixes (which IMHO is more important than a prefix, since it can be used for file extensions, which sometimes matter a great deal; prefixes matter a lot less frequently). It also doesn't let you write to the file when it's created. You have to reopen it and then write to it. Also, the docs themselves seem to think that you shouldn't use it, saying "To avoid problems resulting when converting an ANSI string, an application should call the CreateFile function to create a temporary file." So, GetTempFile can't be used unless we don't have a suffix or if we call rename on the function afterwards (which is likely a bad idea), and Microsoft's docs are basically saying to do what I did. So, I'd just as soon avoid mkstemp regardless of what Windows is up to, but it looks like we need to implement this ourselves on Windows anyway, and this way, what it's doing should be consistent across systems. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to compare this implementation / draw inspiration from existing mkstemp implementations.
std/file.d
Outdated
|
||
// Once should be enough given the randomness, but we'll give it three | ||
// times to be safe. | ||
foreach (i; 0 .. 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing things "just to be safe" smells - this should be an infinite loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a big fan of infinite loops when you can't absolutely guarantee that it will exit, which we can't here, because it depends on the file system - but in theory, it should be essentially impossible for the loop to be infinite if we're picky about the error code like you suggested.
std/file.d
Outdated
auto fd = CreateFileW(tempCString!FSChar(filename), GENERIC_WRITE, 0, null, CREATE_NEW, | ||
FILE_ATTRIBUTE_NORMAL | FILE_FLAG_SEQUENTIAL_SCAN, HANDLE.init); | ||
if (fd == INVALID_HANDLE_VALUE) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop should be unconditionally retried on EEXIST or equivalent. Other errors, like ENOPERM or ENOENT (temporary directory inaccessible on nonexisting) should throw immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
I don't think we should be using the global Edit: Another reason not to use |
What are they?
There's a very good chance that we get things wrong too - definitely some things that might be worth worrying about in the initial implementation here. Generally I don't think D is expected to fix or work around bugs in ancient/unsupported libcs.
Yeah, on closer look, GetTempFileName looks not that great. You have to specify the random value anyway, and it's limited to 65535 unique file names (with the same prefix). |
876c720
to
99aec4e
Compare
Okay, per the review comments, it now is pickier about which errors it retries for, and it's not using
I think that it was Solaris that was the problem, but it's been quite a while since I implemented But given that we need to implement this ourselves on Windows anyway, I'd just as soon use our implementation on all systems, and then if there's a problem, it will be caught more easily. The key thing to getting it right is how the file is opened (it needs to make sure that the file is created when it's opened and can't do that by checking for existence first, or there's a race condition); the rest is mostly just a question of having a suitably random name, and unless we have serious problems with our random number generator, that shouldn't be a problem (and at the moment, this picks 15 random alphanumeric characters, so the number of possible values is huge). |
This is very nice, in that it frees us from having to rely on potentially poorly-implemented However, it warrants very close scrutiny, because when it comes to security, one can never be too paranoid. As far as POSIX is concerned, I think calling But I think reviewing it against various |
Oh, one potential thing: this function ought to be thread-safe. I believe it is (the static RNG seed should be in TLS, right?), but someone should review it carefully just in case. |
If you can think of a potential security hole, then great, but all this does is generate a random file name and then ensure that when it creates the file, that file doesn't already exist. Ensuring that it doesn't overwrite any files solves the key security problem. I don't see how it would be possible to do anything else to protect against security problems, because there's so little involved here. Certainly, if there's a potential security problem with creating a file, closing it, and reopening it, then there could be a problem, but there's nothing that we can do about that. If that's your problem, then go use
Of course, it's thread-safe. Nothing in it is |
798f714
to
f299c01
Compare
@quickfur If it makes you feel any better, I just looked over FreeBSD's implementation of |
8a50e0f
to
4590b9d
Compare
FWIW I approve the API addition. Please work out the controversy in the implementation. |
AFAICT the 'controversy' was put to rest on January 26. @CyberShadow was satisfied with this resolution:
|
|
||
{ | ||
auto path = createTempFile(); | ||
scope(exit) if (path.exists) remove(path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why if (path.exists)
. Should this either be scope(exit) assert(path.exists), remove(path);
or scope(exit) remove(path);
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path.exists
portion could be removed, but I pretty much always use it before removing a file so that I don't have to worry about remove
throwing due to something that happened when the code is changed later or something else unforeseen goes wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If tempFile
does fail to create the file for some reason, then one of the tests after the scope statement will fail, and the AssertError
will bubble up properly, but if the test isn't there, then in that case, we'd then get a FileException
being thrown instead when remove
failed. So, IMHO, it is better to have the check even if it's useless if everything is working correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I've run into this before, and found myself wishing for an option for remove
to ignore file-not-found errors. The last time I had to deal with this, I ended up calling core.sys.posix.unistd.remove
directly instead. Unfortunately that introduces OS dependence.
Perhaps this could be filed as an enhancement request? Anybody interested in that?
Ah. Great. It looks like someone committed a change to std.random which breaks this PR. |
@n8sh I agree, I think we should just merge this once the autotester failure has been fixed. A large part of the discussion in this PR is just Yet Another Case of the perfect being the enemy of the good. Surely we can merge this first and then fix it later if it's really that necessary. My suspicion is that nobody will actually care (and those who are paranoid enough to care will probably just write their own version of the function anyway, instead of relying on Phobos). |
The breakage was caused by this PR, which is going to need to be reverted. |
This adds std.file.createTempFile, which creates a temporary file with a random name (optionally with a specified prefix and/or suffix). By default, the file is empty, but if data is passed to createTempFile, then the file is populated with that data just like it would have been with std.file.write. The name of the file that was created is returned. So, essentially, createTempFile is like write except that it generates the file name for you and returns it. Previously, we added scratchFile to std.stdio, which was like createTempFile except that it gave you an open File object, but it was scrapped, because the dependencies that it added to std.stdio made "hello world" bigger. See: https://issues.dlang.org/show_bug.cgi?id=14599 Just like scratchFile did, createTempFile ensures that a filename is selected which did not exist previously so that the file can't be hijacked with different permissions by someone creating a file with the same name beforehand (the randomness of the name should make that effectively impossible, but the way that createTempFile opens the file guarantees it). These changes also consolidate some of write's implementation across OSes, since the writing portion of writeImpl was esentially identical across OSes but needlessly duplicated. That consolidated functionality is then also used by createTempFile.
Okay. It looks like this is passing again now that the breaking change to std.random.choice has been reverted, and in theory, the controversy with the random number generator was solved by adding a note to tho documentation about it. But someone still has to actually give an approving review for the PR. |
} | ||
cenforce(sum == size, name, namez); | ||
} | ||
cenforce(core.sys.posix.unistd.close(fd) == 0, name, namez); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General note, separating related refactorings into their own commit makes reviews easier and faster
@@ -809,8 +810,6 @@ if (isConvertibleToString!R) | |||
static assert(__traits(compiles, append(TestAliasedString("foo"), [0, 1, 2, 3]))); | |||
} | |||
|
|||
// Posix implementation helper for write and append | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
cnt = (size - sum < 2^^30) ? (size - sum) : 2^^30; | ||
WriteFile(h, buffer.ptr + sum, cast(uint) cnt, &numwritten, null); | ||
if (numwritten != cnt) | ||
cnt = size - sum < 2^^30 ? size - sum : 2^^30; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? The parens were redundant but they were clearly there for a reason (readability).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, mixing stylistic changes, refactorings, and adding new code, all in one commit, is pretty bad. Please avoid doing this in the future.
of the file. If data is passed in, then the file is written with that data; | ||
otherwise, it's empty. | ||
|
||
The idea is that this creates a temporary file for testing or whatever |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bump, this wasn't addressed.
$(REF rndGen, std, random) is used as the random number generator. | ||
|
||
Params: | ||
prefix = Prefix to the random portion of the file name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be a directory path? In some terminology (e.g. autotools) "prefix" is implied to mean a path prefix. Please clarify.
return buildPath(tempDir, name); | ||
} | ||
|
||
while (1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using 1
as true
is a C-ism (which does not have a bool
type). (Since D has a bool
type, numbers should be used to indicate quantity, and 1 here does not indicate the quantity of anything.) Please use while (true)
in D.
} | ||
} | ||
|
||
/// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a fine unit test, but not a good example. Would be nice to have a human-friendly example alongside this unit test.
else | ||
static assert(0, "Unsupported OS"); | ||
|
||
writeToOpenFile(fd, buffer, filename, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is returning without closing the file descriptor okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! This is a raw OS handle, not a std.stdio.File
, so there is no refcounting or GC or anything of that sort. So unless I'm missing something obvious, yes we will need to close the file descriptor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, looking at the code, writeToOpenFile
closes the file descriptor so that the code that involves closing the file descriptor is shared with writeImpl
just like the code to actually write to the open file is. I guess that I should have named it in a way that made it clear.
Bump. Maybe move this to |
We've never done that, and I see no reason to start now. It would just result in quickly breaking the code of anyone who actually used it. I just need to address the last few comments, and it should be good to go. Hopefully, I can get to it this weekend. |
BTW, why does this close the file and return only its name rather than returning a |
std.file does not use And personally, for basically every use case that I would have for a function like this, returning a |
@jmdavis got it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to figure what this has over
auto tmpName = tempDir.buildPath("my_" ~ uniformAlphaNumerics(15) ~ ".txt");
and came up with the following reasons:
- We don't have uniformAlphaNumerics. Would be great to have a convenient way to generate random strings, it's useful in many other situations.
- In the exceedingly rare case when the file name actually exists, this PR tries again automatically.
These are worthy, but on the other hand this PR has 178 lines and 178 lines are quite a lot more than 1 line. In particular, the entire notion of writing to the created file is a waste. Why not let the user write a second line that does just that? We have well-tested functions that do that kind of stuff. (The fact that this PR adds its own bugs whilst attempting to reuse code is telling.)
What I think would be good engineering: add a method makeTemporary(const(char)[] prefix = null, const(char)[] suffix = null)
to the File
type and have it generate the random name and open the file.
auto tmp = File.makeTemporary("good_", ".job");
... use tmp ...
... tmp gets closed nicely by File's destructor ...
`createTempFile` added to std.file | ||
|
||
$(REF std, file, createTempFile) creates a file in $(REF tempDir, std, file) | ||
with a random name (with an optionally provided prefix and suffix) and returns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
repeated "with", use e.g. s/with an/including an/
$(REF std, file, createTempFile) creates a file in $(REF tempDir, std, file) | ||
with a random name (with an optionally provided prefix and suffix) and returns | ||
the file name. It also optionally takes data to write to the file when creating | ||
it (if no data is provided, then the newly created file will be empty). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's awkward that the file is created, closed, to presumably be opened by the user again right after. Shouldn't this function return e.g. an opened File
? Then people can call name
to get its name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning an open File
would fit in std.stdio, not std.file, because std.file specifically only ever deals with entire files at at time. Also, in most cases, I'd personally find it very annoying to have a File
to deal with. The main purpose is to write data to a file for testing or to pass to another program and then pass the file name to something else to open it. Having an open File
rather than just a filename just makes that use case more annoying. We did have a version of this in std.stdio briefly that did that, but it got removed from std.stdio on the grounds that it increased the size of "hello world" due to issues with the compiler.
The way I would expect to use this normally would be to just call this function like write
and write the whole thing at once, and in the rarer case where they want a File
, they can just not pass any data to it and then reopen it with File
(in which case, since the file was already created, there is no security risk like there is the first time).
and then deletes the file when the $(REF File, std, stdio) is closed. The file | ||
created by $(REF std, file, createTempFile) is a normal file which can be | ||
opened and closed as many times as desired and will only be deleted by the | ||
program if the program explicitly does so (though it is sitting in a temp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/is sitting/is created/
created by $(REF std, file, createTempFile) is a normal file which can be | ||
opened and closed as many times as desired and will only be deleted by the | ||
program if the program explicitly does so (though it is sitting in a temp | ||
directory and is subject to the normal things tha happen to such files - e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"normal things" is too colloquial, please rephrase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "tha"
cenforce(sum == size, name, namez); | ||
} | ||
cenforce(core.sys.posix.unistd.close(fd) == 0, name, namez); | ||
writeToOpenFile(fd, buffer, name, namez); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who closes fd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code, writeToOpenFile
does. I suppose that I could rename it to writeToOpenFileAndClose
or something to make it clearer. It's just a private function that was created to refactor some code that writeImpl
uses rather than duplicating it.
|
||
/++ | ||
Creates a file with a random name in $(LREF tempDir) and returns the name | ||
of the file. If data is passed in, then the file is written with that data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/returns the name of the file/returns its name/
Generally the prose could use some editing.
of the file. If data is passed in, then the file is written with that data; | ||
otherwise, it's empty. | ||
|
||
The idea is that this creates a temporary file for testing or whatever |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"whatever" is also odd
be deleted if it's explicitly deleted or if the OS decides to delete it as | ||
some OSes do with files in temp directories on startup or shutdown. So, | ||
unlike $(REF File.tmpfile, std, stdio), the file has a name, and it can have | ||
all of the things done to it that one might typically do with a file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again with "things"...
|
||
The file is created with R/W permissions. On POSIX systems, the permissions | ||
are restricted to the current user, though the effective permissions are | ||
modified by the process' umask in the usual way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be important to allow configurable permissions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important? I don't think so. We don't provide that ability with write
, and that does start getting a bit system-specific to implement. It might be a reasonable enhancement, but if we did something like that, then we'd need to do the same with other functions like write
. So, if we want to do that, I think that it should be left to a separate PR.
See_Also: | ||
$(REF File.tmpfile, std, stdio) | ||
+/ | ||
string createTempFile(const void[] buffer = null) @safe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be an overload. Consider: createTempFile("my_prefix_")
. The user will get the wrong idea. Let them type null
, null
if they don't care.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine with me. I'd expect the normal use case to involve passing data anyway. I suspect that I assumed that folks would find it annoying to have to pass null
in the case where they didn't want to write data, but it's been a while since I wrote this, so I don't remember for sure.
The biggest thing that a function like this does is guarantee that the file is created when it's opened, since if it's not actually created when it's opened, you have a security flaw, since someone could theoretically create the file beforehand with different permissions. And it's not the sort of thing that someone is usually going to think of unless it's been explained to them before. So, a large percentage of folks who would try something like this would try a line like you just suggested and simply open the file with that and have a security flaw in their program. All other details aside, that is the big reason to have a function like this. |
How is this different than |
@Flying-Toast |
What is this missing (apart from the small conflicts in std/file.d) for merging? It seems like most of the work was done by @jmdavis so I am bumping this |
@FraMecca IIRC, it's basically good but needs to be cleaned up a bit (e.g. one of the function names was named badly and was confusing reviewers about whether the file descriptor was being closed properly or not), though I'd have to look over it all again to see what exactly would need to be tweaked. I've just never gotten around to cleaning it up. |
Status? |
This adds std.file.createTempFile, which creates a temporary file with a
random name (optionally with a specified prefix and/or suffix). By
default, the file is empty, but if data is passed to createTempFile,
then the file is populated with that data just like it would have been
with std.file.write. The name of the file that was created is returned.
So, essentially, createTempFile is like write except that it generates
the file name for you and returns it.
Previously, we added scratchFile to std.stdio, which was like
createTempFile except that it gave you an open File object, but it was
scrapped, because the dependencies that it added to std.stdio made
"hello world" bigger. See:
https://issues.dlang.org/show_bug.cgi?id=14599
Just like scratchFile did, createTempFile ensures that a filename is
selected which did not exist previously so that the file can't be
hijacked with different permissions by someone creating a file with the
same name beforehand (the randomness of the name should make that
effectively impossible, but the way that createTempFile opens the file
guarantees it).
These changes also consolidate some of write's implementation across
OSes, since the writing portion of writeImpl was esentially identical
across OSes but needlessly duplicated. That consolidated functionality
is then also used by createTempFile.
Unfortunately, I don't currently have a Windows setup that I can test this with, so depending on what happens with the autotester, I may need to make a revision or two to get it to pass on Windows. I might get lucky though.