Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upkeep proposition / spring cleaning #549

Closed
7 of 9 tasks
olivroy opened this issue Jun 13, 2023 · 10 comments
Closed
7 of 9 tasks

Upkeep proposition / spring cleaning #549

olivroy opened this issue Jun 13, 2023 · 10 comments

Comments

@olivroy
Copy link
Contributor

olivroy commented Jun 13, 2023

Hi,

I would have a couple suggestions to improve janitor

Let me know what you think and I can open PRs for the things you think are relevant!

Edit:

@billdenney
Copy link
Collaborator

Thanks for this help!

I like all of this except for the snapshot tests. I don't prefer snapshot tests because they don't fail on CRAN. I prefer explicit tests for messages with regular expressions or specific examples so that we always know that we are testing exactly what we intend to test.

For the error messages, I'd prefer using rlang rather than cli since we already depend on rlang (even though I like the cli visuals better).

I definitely think we should remove the superseded function use, if reasonable. We will need to be careful of functional changes when switching to the new functions. janitor is downloaded >100k times/month, so breaking changes should be minimized at (almost) all costs.

@sfirke, what are your thoughts?

@olivroy
Copy link
Contributor Author

olivroy commented Jun 13, 2023

I like all of this except for the snapshot tests. I don't prefer snapshot tests because they don't fail on CRAN. I prefer explicit tests for messages with regular expressions or specific examples so that we always know that we are testing exactly what we intend to test.

Good! I will try opening PRs when #548 is merged to avoid unnecessary conflicts as much as possible!

For the error messages, I'd prefer using rlang rather than cli since we already depend on rlang (even though I like the cli visuals better).

Since dplyr already imports cli, it wouldn't add a dependency to janitor (although it would have to be explicitly put in Imports, but rlang is good too! (having a dep on stringi is "free" since stringr is already there). Although I really like the inline markup features of cli.
See a PR I made for styler.

Edit: See https://rlang.r-lib.org/reference/topic-condition-formatting.html?q=abort#enabling-cli-formatting-globally for a solution to avoid explicitly importing cli.

I definitely think we should remove the superseded function use, if reasonable. We will need to be careful of functional changes when switching to the new functions. janitor is downloaded >100k times/month, so breaking changes should be minimized at (almost) all costs.

For sure! It may be when other packages extend janitor and try to apply functions to an object of a different class. Maybe we should wait until the next release for that, to see if #547 caused breakages.

FWIW, when I tried switching from dplyr::rename_all() to dplyr::rename_with() for the tidygraph method for clean_names(), it didn't work, so I ended up not changing this one, because I think it didn't have a rename_with() method for the class

@sfirke
Copy link
Owner

sfirke commented Jun 14, 2023

This all sounds great to me! I appreciate the thoughtful contributions and the way you are going about it (discussion first) 🙏

I'd defer to you two on cli vs. rlang though if you both like the cli result better, maybe that's a reason to go in that direction?

Agreed re: swapping out gather and spread in a non-breaking way. If doing it in the most straightforward way would result in breaking changes, let's discuss. In particular if you are talking about changes that result in objects of a different class, there might be room for that between versions if it made sense. There are few issues open like that, #301 and tidyverse/dplyr#6689 which has the steps to extend dplyr to so that its verbs work on tabyls.

Only other thought is let's go one PR at a time for these to break things into manageable chunks.

@olivroy
Copy link
Contributor Author

olivroy commented Jun 14, 2023

I'd defer to you two on cli vs. rlang though if you both like the cli result better, maybe that's a reason to go in that direction?

@billdenney let me know what you think
Since dplyr already imports cli, it wouldn't add a dependency to janitor (although it would have to be explicitly put in Imports, but rlang is good too! (having a dep on stringi is "free" since stringr is already there). Although I really like the inline markup features of cli.
See a PR I made for styler.

Edit: See https://rlang.r-lib.org/reference/topic-condition-formatting.html?q=abort#enabling-cli-formatting-globally for a solution to avoid explicitly importing cli.

Only other thought is let's go one PR at a time for these to break things into manageable chunks.

Yes, of course. I will start once the styling PR (#548) is merged! Thanks btw for adding a GitHub action to ensure consistent styling

@billdenney
Copy link
Collaborator

I think that cli is somewhat better, so let's go for that.

I think that we are ready to merge the styling PR, but I will defer to @sfirke for the final merge.

@sfirke
Copy link
Owner

sfirke commented Jun 14, 2023

#548 is merged! I'll be curious to see if the auto-styler runs correctly on the next PR. Maybe leave a trailing space or something to provoke it 😉

@olivroy
Copy link
Contributor Author

olivroy commented Jun 15, 2023

So as mentioned in #551, the style action failed when I added a commit that was not correctly styled. (since I don't have write access. But that if some code is left "unstyled" at the end of a PR, I assume that when you merge in the main branch, it adds a commit to restyle automatically (which is what is intended,

It does send a notification !
image

Edit there is possibly this

Config/autostyle/scope: line_breaks
Config/autostyle/strict: true

@olivroy
Copy link
Contributor Author

olivroy commented Jun 15, 2023

Potential things that could be done too!

  • Delete index.Rmd / index.md
    • it is a lot like README, and it seems outdated compared to README. (i.e., it contains Travis Build information). It doesn't seem useful to maintain since pkgdown uses README if index.md is not found. pkgdown doc
    • I checked the diff locally, and from my point of view, it doesn't seem worth it to keep it. You can check it yourself.
library(diffviewer) # A r-lib package 
# Will open an html widget in RStudio to visualize the diff between index and README
# You will see the only additional things in index.md seem to be outdated.
diffviewer::visual_diff(
  file_old = "index.md",
  file_new = "README.md",
  height = Inf,
  width = "100%"
)
  • (Optional) Maybe dirty_data.xlsx could be moved (or duplicated to avoid breaking changes, so that it is installed in the package. See https://r-pkgs.org/data.html#sec-data-extdata
    • With that someone could access janitor's dirty_data with read_excel(system.file("dirty_data.xlsx", "extdata", package = "janitor"))
    • It would increase slightly the install size (+13.2kb)

Thanks for letting me experiment a little! I like your package, it is used, and not too complex. Don't feel pressured to answer right away, and let me know if it is too much!

@billdenney
Copy link
Collaborator

@olivroy, I really appreciate the "let me know if it is too much"! 😄 I don't think it's too much, and you're helping with some beneficial spring cleaning which many packages could benefit from.

For index.Rmd/README, if we can maintain things in one place, I think that is helpful. So, I'd go for it as long as all the appropriate information is saved from both. (@sfirke, you have shouldered the load of this type of documentation-- is there anything unique remaining in index.Rmd?)

I'm neutral for the inclusion of dirty_data.xlsx.

@olivroy
Copy link
Contributor Author

olivroy commented Dec 7, 2023

Thanks for your help on this ! Truly appreciated

@olivroy olivroy closed this as completed Dec 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants