Set old names as labels in clean_names #563

Open
jospueyo opened this issue Jan 21, 2024 · 6 comments

@jospueyo

Today I joined this thread on Mastodon, which led me to read this blog post about data labels in R. I think it would be nice if clean_names() had an option to keep the old names as labels in the data frame.

I don't think it would be too complicated. I'm keen to open a PR if you are willing to accept the feature. It would be something like this...

if (set_labels){
    old_names <- names(dat)
    # seq_along() avoids iterating over c(1, 0) when dat has zero columns
    purrr::walk(seq_along(dat), \(i) attr(dat[[i]], "label") <<- old_names[[i]])
}

names(dat) <- make_clean_names(names(dat))

If you don't like using superassignment, this can also be done in a for loop.
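
A minimal sketch of the for-loop variant mentioned above, in base R only. The helper name `set_name_labels()` is hypothetical and is not part of janitor; the hard-coded clean names stand in for a call to `make_clean_names()` so that the example runs without any package installed.

```r
# Hypothetical helper (not janitor's API): store each column's current
# name in that column's "label" attribute, without superassignment.
set_name_labels <- function(dat) {
  old_names <- names(dat)
  for (i in seq_along(dat)) {
    attr(dat[[i]], "label") <- old_names[[i]]
  }
  dat
}

dat <- data.frame(
  `First Name` = c("a", "b"),
  `Age (yrs)` = c(1, 2),
  check.names = FALSE
)

dat <- set_name_labels(dat)

# Stand-in for names(dat) <- make_clean_names(names(dat))
names(dat) <- c("first_name", "age_yrs")

attr(dat$first_name, "label")
attr(dat$age_yrs, "label")
```

Because the labels are set before the names are replaced, each column keeps its original name as an attribute after cleaning.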

@billdenney
Collaborator

@jospueyo, thanks for your interest in the package. I understand the value of keeping the original name associated with the column.

I like that you would make including the labels optional via the set_labels argument. I also like that you accurately predicted that I would not like super-assignment. 😄

So, I think that PR could be useful. As with any PR:

  • Please make the default backwards compatible (so set_labels would default to FALSE). I, too, have had issues with labels messing things up in the past, as mentioned on the Mastodon thread.
  • Please add tests for the changes
  • Please add it to the News file
  • Please add yourself as a contributor

@sfirke
Owner

sfirke commented Jan 21, 2024

Sounds good to me too. After this is introduced I could see having tabyl() use labels where available, but that's unrelated (in code at least) to this suggestion.

jospueyo pushed a commit to jospueyo/janitor that referenced this issue Jan 21, 2024
@sfirke
Owner

sfirke commented Jan 25, 2024

Looking at that Masto thread, some people talk about wanting to restore the original names at some point. Do we know if there's a function in another labelling package that transfers the labels into the column names? Kind of undoing the clean_names call.

It seems like a potentially simple function, but (a) I hope someone else has already written a good version we can point to, and (b) I don't want to overcomplicate things, and it would need some thought about edge cases, for example what to do if only some of the original columns are present.
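
To make the edge case concrete, here is a rough base R sketch of such a reverse function. The name `labels_to_names()` and its fallback rule (columns without a usable label keep their current name) are assumptions for illustration only; this is not an existing janitor function, and it does not handle duplicated labels.

```r
# Hypothetical helper: copy each column's "label" attribute back into
# the column name. Columns without a single, non-missing character
# label keep their current name.
labels_to_names <- function(dat) {
  labels <- vapply(
    dat,
    function(col) {
      lab <- attr(col, "label", exact = TRUE)
      if (is.character(lab) && length(lab) == 1L && !is.na(lab)) {
        lab
      } else {
        NA_character_
      }
    },
    character(1)
  )
  has_label <- !is.na(labels)
  names(dat)[has_label] <- unname(labels[has_label])
  dat
}

dat <- data.frame(first_name = c("a", "b"))
attr(dat$first_name, "label") <- "First Name"
dat$n <- c(1L, 2L) # added after cleaning, so it has no label

restored <- labels_to_names(dat)
names(restored)
```

Here only `first_name` is renamed; `n` is left alone because it never had an original name to restore.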

@billdenney
Collaborator

Something like dirty_names()? 🤣 (I don't know if it already exists.)

My experience with attributes like this is that they're lost easily. Creating a reverse function (like dirty_names() or likely better label_to_names()) would likely end with a lot of questions. That doesn't mean we shouldn't create it, but we should have some very good documentation before including it in janitor.
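
As a small illustration of how easily such attributes are lost, the base R sketch below attaches a label to a plain numeric column and then subsets the rows. The column names and label text are made up for the example.

```r
dat <- data.frame(weight_kg = c(10, 20, 30), id = 1:3)
attr(dat$weight_kg, "label") <- "Weight (kg)"

# The label is present on the full column
attr(dat$weight_kg, "label")

# Row subsetting rebuilds each column with `[`, which drops most
# attributes on plain vectors, including "label"
heavy <- dat[dat$weight_kg > 10, ]
is.null(attr(heavy$weight_kg, "label"))
```

Labelling packages typically work around this by giving labelled columns their own class with subsetting methods that preserve the label, which a bare `attr()` call does not do.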

@sfirke
Owner

sfirke commented Jan 25, 2024

Looks to me like sjlabelled::label_to_colnames might do it, what do you think?

@billdenney
Collaborator

billdenney commented Jan 25, 2024

@sfirke, that looks like the right solution to me. We should refer to it in the documentation.
