Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor allocations that include multiple structures #828

Open
krader1961 opened this issue Sep 3, 2018 · 3 comments
Open

Refactor allocations that include multiple structures #828

krader1961 opened this issue Sep 3, 2018 · 3 comments

Comments

@krader1961
Copy link
Contributor

See issue #824 and PR #827 for background. The core problem is that there are multiple places in the code which allocate a single buffer that is carved up into individual structures which point to each other. This leads to allocations like this (from src/cmd/ksh93/sh/args.c):

malloc(sizeof(struct Sort) + strlen(keylist) + (sizeof(char *) + 1) * keys +
       (arp->nelem + 1) * (nodesize + sizeof(void *)) + c + 3);

I challenge you to tell me whether that allocation is the correct size or not. Show your work.

Hint: It was definitely not correct prior to PR #827 because nodesize might have been too small. And even after that change I can't tell whether the size is correct. In no small part because of subexpressions like the trailing + 3. Why do three extra bytes need to be included in the allocation? What is the purpose of the + 1 in subexpression (sizeof(char *) + 1) * keys?

@DavidMorano
Copy link

malloc(sizeof(struct Sort) + strlen(keylist) + (sizeof(char *) + 1) * keys +
(arp->nelem + 1) * (nodesize + sizeof(void *)) + c + 3);

I see this kind of code (and worse) everyday! I will not name names in order to protect the guilty. But yes, this is extremely bad programming practice, and should be refactored when possible.

@krader1961
Copy link
Contributor Author

krader1961 commented Sep 3, 2018

FWIW, I have no objection to patterns like this:

struct {
    int x;
    int y;
    char *ptrs[0];
} a_struct;

int nptrs = 3;
struct a_struct *z = malloc(sizeof(struct a_struct) + sizeof(char *) * nptrs);

That's a commonly used idiom that is easy to understand and verify is correct.

P.S., When I asked earlier what the + 3 term was for I was being a bit disingenuous. There are two strings being embedded in the buffer so two of those three bytes are for the terminating null character of each string. That still leaves the third byte unexplained as well as the other places + 1 appears.

@krader1961
Copy link
Contributor Author

By the way, the pattern I'm arguing should be eliminated also makes tools like ASAN (address sanitizer) much less useful. That's because the buffers are contiguous which limits the ability of tools like ASAN to detect when a buffer is accessed beyond its boundaries. Which results in false negatives (aka silent bugs).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants