Suggestion: get_dupes results should be sorted by dupe_count descending #493

sfirke · 2022-11-16T16:46:08Z

Feature requests

Sort the results of get_dupes by dupe_count.

Remarks

Right now get_dupes sorts its output by the grouping variable(s), so I usually append %>% arrange(desc(dupe_count)) to start with the most-duplicated combination. Sorting by dupe_count by default could be a breaking change for some use cases, but I think it is ultimately more useful than the alphabetical sort.

Example

mtcars %>% get_dupes(cyl)

With the proposed sort, the cyl = 8 group, which has 14 records, would appear first.
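For anyone following along, here is a minimal sketch of the manual re-sort mentioned in the remarks. It assumes the dplyr and janitor packages are installed; the variable name `dupes` is only illustrative.

```r
library(dplyr)
library(janitor)

# Find rows that share a cyl value, then put the most-duplicated group first.
# The arrange() step is the manual workaround described above.
dupes <- mtcars %>%
  get_dupes(cyl) %>%
  arrange(desc(dupe_count))

# The first rows now belong to the largest group (cyl = 8, 14 records).
head(dupes[, c("cyl", "dupe_count")])
```

The explicit arrange() is harmless if get_dupes already returns rows in this order, so the snippet should behave the same either way.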

This would be easy to implement. Thoughts from others?

billdenney · 2022-11-16T17:03:43Z

It makes sense to me. When I use get_dupes(), the goal is either to confirm that there are no duplicates or to manually inspect them. Neither of my use cases is broken with your suggested change, and the manual inspection is improved.

My one addition would be to break a tie by sorting alphabetically second so that there is a consistent output order.
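To make the tie-break concrete, here is a rough sketch using plain dplyr on a made-up data frame. It is not janitor's actual implementation, only an illustration of the ordering; the data frame `df` and its columns `key` and `value` are hypothetical.

```r
library(dplyr)

# Hypothetical data: "a" and "b" each appear twice (a tie), "c" three times,
# and "d" once (so it is not a duplicate).
df <- data.frame(
  key = c("b", "b", "a", "a", "c", "c", "c", "d"),
  value = 1:8,
  stringsAsFactors = FALSE
)

sorted_dupes <- df %>%
  add_count(key, name = "dupe_count") %>%  # count rows per key
  filter(dupe_count > 1) %>%               # keep only duplicated keys
  arrange(desc(dupe_count), key)           # count first, then key to break ties

# Groups appear in the order c, a, b.
unique(sorted_dupes$key)
```

Without the second sort key, the relative order of "a" and "b" would depend on how they happened to appear in the input, which is the inconsistency the tie-break avoids.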

jzadra · 2022-11-16T21:02:32Z

Yes, I think this is a good enhancement.

sfirke added the "seeking comments" label (Users and any interested parties should please weigh in - this is in a discussion phase!) Nov 16, 2022

JasonAizkalns added a commit to JasonAizkalns/janitor that referenced this issue Dec 1, 2022

Create allow_dupes param, resolves sfirke#493

203467a

sfirke mentioned this issue Jan 12, 2023

get_dupes sorts first on desc(dupe_count) #511

Merged

sfirke closed this as completed in #511 Jan 12, 2023

sfirke added a commit that referenced this issue Jan 12, 2023

get_dupes sorts first on desc(dupe_count) (#511)

d528ec9

fixes #493
