Using tabyl with index-numbers #317
suppressPackageStartupMessages(require(tidyverse))
suppressPackageStartupMessages(require(janitor))
mtcars %>%
  pivot_longer(-gear) %>%
  group_split(name) %>%
  map(function(x) {
    varname <- x %>% distinct(name) %>% pull(name)
    x %>%
      rename(!!varname := value) %>%
      tabyl(!!sym(varname), gear)
  })
#> [[1]]
#> am 3 4 5
#> 0 15 4 0
#> 1 0 8 5
#>
#> [[2]]
#> carb 3 4 5
#> 1 3 4 0
#> 2 4 4 2
#> 3 3 0 0
#> 4 5 4 1
#> 6 0 0 1
#> 8 0 0 1
#>
#> [[3]]
#> cyl 3 4 5
#> 4 1 8 2
#> 6 2 4 1
#> 8 12 0 2
#>
#> [[4]]
#> disp 3 4 5
#> 71.1 0 1 0
#> 75.7 0 1 0
#> 78.7 0 1 0
#> 79.0 0 1 0
#> 95.1 0 0 1
#> 108.0 0 1 0
#> 120.1 1 0 0
#> 120.3 0 0 1
#> 121.0 0 1 0
#> 140.8 0 1 0
#> 145.0 0 0 1
#> 146.7 0 1 0
#> 160.0 0 2 0
#> 167.6 0 2 0
#> 225.0 1 0 0
#> 258.0 1 0 0
#> 275.8 3 0 0
#> 301.0 0 0 1
#> 304.0 1 0 0
#> 318.0 1 0 0
#> 350.0 1 0 0
#> 351.0 0 0 1
#> 360.0 2 0 0
#> 400.0 1 0 0
#> 440.0 1 0 0
#> 460.0 1 0 0
#> 472.0 1 0 0
#>
#> [[5]]
#> drat 3 4 5
#> 2.76 2 0 0
#> 2.93 1 0 0
#> 3.00 1 0 0
#> 3.07 3 0 0
#> 3.08 2 0 0
#> 3.15 2 0 0
#> 3.21 1 0 0
#> 3.23 1 0 0
#> 3.54 0 0 1
#> 3.62 0 0 1
#> 3.69 0 1 0
#> 3.70 1 0 0
#> 3.73 1 0 0
#> 3.77 0 0 1
#> 3.85 0 1 0
#> 3.90 0 2 0
#> 3.92 0 3 0
#> 4.08 0 2 0
#> 4.11 0 1 0
#> 4.22 0 1 1
#> 4.43 0 0 1
#> 4.93 0 1 0
#>
#> [[6]]
#> hp 3 4 5
#> 52 0 1 0
#> 62 0 1 0
#> 65 0 1 0
#> 66 0 2 0
#> 91 0 0 1
#> 93 0 1 0
#> 95 0 1 0
#> 97 1 0 0
#> 105 1 0 0
#> 109 0 1 0
#> 110 1 2 0
#> 113 0 0 1
#> 123 0 2 0
#> 150 2 0 0
#> 175 2 0 1
#> 180 3 0 0
#> 205 1 0 0
#> 215 1 0 0
#> 230 1 0 0
#> 245 2 0 0
#> 264 0 0 1
#> 335 0 0 1
#>
#> [[7]]
#> mpg 3 4 5
#> 10.4 2 0 0
#> 13.3 1 0 0
#> 14.3 1 0 0
#> 14.7 1 0 0
#> 15.0 0 0 1
#> 15.2 2 0 0
#> 15.5 1 0 0
#> 15.8 0 0 1
#> 16.4 1 0 0
#> 17.3 1 0 0
#> 17.8 0 1 0
#> 18.1 1 0 0
#> 18.7 1 0 0
#> 19.2 1 1 0
#> 19.7 0 0 1
#> 21.0 0 2 0
#> 21.4 1 1 0
#> 21.5 1 0 0
#> 22.8 0 2 0
#> 24.4 0 1 0
#> 26.0 0 0 1
#> 27.3 0 1 0
#> 30.4 0 1 1
#> 32.4 0 1 0
#> 33.9 0 1 0
#>
#> [[8]]
#> qsec 3 4 5
#> 14.50 0 0 1
#> 14.60 0 0 1
#> 15.41 1 0 0
#> 15.50 0 0 1
#> 15.84 1 0 0
#> 16.46 0 1 0
#> 16.70 0 0 1
#> 16.87 1 0 0
#> 16.90 0 0 1
#> 17.02 1 1 0
#> 17.05 1 0 0
#> 17.30 1 0 0
#> 17.40 1 0 0
#> 17.42 1 0 0
#> 17.60 1 0 0
#> 17.82 1 0 0
#> 17.98 1 0 0
#> 18.00 1 0 0
#> 18.30 0 1 0
#> 18.52 0 1 0
#> 18.60 0 1 0
#> 18.61 0 1 0
#> 18.90 0 2 0
#> 19.44 1 0 0
#> 19.47 0 1 0
#> 19.90 0 1 0
#> 20.00 0 1 0
#> 20.01 1 0 0
#> 20.22 1 0 0
#> 22.90 0 1 0
#>
#> [[9]]
#> vs 3 4 5
#> 0 12 2 4
#> 1 3 10 1
#>
#> [[10]]
#> wt 3 4 5
#> 1.513 0 0 1
#> 1.615 0 1 0
#> 1.835 0 1 0
#> 1.935 0 1 0
#> 2.140 0 0 1
#> 2.200 0 1 0
#> 2.320 0 1 0
#> 2.465 1 0 0
#> 2.620 0 1 0
#> 2.770 0 0 1
#> 2.780 0 1 0
#> 2.875 0 1 0
#> 3.150 0 1 0
#> 3.170 0 0 1
#> 3.190 0 1 0
#> 3.215 1 0 0
#> 3.435 1 0 0
#> 3.440 1 2 0
#> 3.460 1 0 0
#> 3.520 1 0 0
#> 3.570 1 0 1
#> 3.730 1 0 0
#> 3.780 1 0 0
#> 3.840 1 0 0
#> 3.845 1 0 0
#> 4.070 1 0 0
#> 5.250 1 0 0
#> 5.345 1 0 0
#> 5.424 1 0 0
Created on 2019-11-25 by the reprex package (v0.3.0) |
Thanks jzadra, I would like to understand your code better to improve my own skills. From what I gathered, you transform the data from wide to long format and then split the df into list elements, each containing "gear" plus the variable to be tabulated against it and that variable's values. And as an addendum: would you be so kind as to add a brief comment after each line explaining your rationale for it? What's the goal of your transformation? |
Certainly! Yes, going to long does go against the tidy data standard, but only for a second, because in the next step we split the long format into separate tables that are then back to tidy data. Going wide to long is useful on many occasions when you want to functionalize something (e.g. using map) or vectorize some operation across multiple columns. Often you then return to wide afterwards. The reason for not using indices, in my opinion, is that they are incredibly prone to breakage. If you change something in an earlier processing step that adds a column or a row, all of a sudden your indices no longer refer to what they used to refer to. When you refer to things by name, on the other hand, it doesn't matter what location they move to. My general goal below is to make it so that I don't have to type the names of all the columns we want to compare with
Hope that helps! |
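The annotated code from this reply did not survive in the thread as shown here. As a rough stand-in, here is the pipeline from the first comment with a comment on each step. The comments are an interpretation of what each line does, not jzadra's original wording, and the `tables` name is only there for illustration.

```r
library(tidyverse)
library(janitor)

tables <- mtcars %>%
  # Wide to long: keep `gear`, stack every other column into name/value pairs
  pivot_longer(-gear) %>%
  # One data frame per original column (pieces come back sorted by `name`)
  group_split(name) %>%
  map(function(x) {
    # Each piece holds a single variable; recover its name as a string
    varname <- x %>% distinct(name) %>% pull(name)
    x %>%
      # Rename the generic `value` column back to the variable's own name
      rename(!!varname := value) %>%
      # Cross-tabulate that variable against `gear`
      tabyl(!!sym(varname), gear)
  })

# Number of tables produced, one per column other than `gear`
length(tables)
```

Because every step refers to columns by name, adding or reordering columns upstream should not change which variable ends up in which table.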
Thank you @jzadra for providing this thoughtful, detailed response! 🙌 |
@jzadra : wonderful!!! Appreciate this a lot! |
Hi there! I found this extremely helpful - I essentially replicated the code above for my own dataset (switching the order of variables in the table slightly) and it worked once but since I tried to clean it up slightly, I've repeatedly generated this error: Error in Any help would be hugely appreciated (@jzadra I wonder whether you have any quick thoughts?)! Best, PS New to this forum so apologies in advance if I've missed some of the conventions.
|
I can't say for sure but it looks like |
Thank you very much for getting back to me @sfirke! I've tried this but unfortunately I then get the message, " Error in :=(!!varname, value) : could not find function ":="" - is it just the "dplyr" package I need to read in or is there perhaps another package I'm missing (or perhaps I've loaded another redundant package that is causing the issue)? Best, Rachel |
It might be from the rlang package? I'm not really sure how to help further since it's not a janitor-related issue, sorry.
You might try posting a question on StackOverflow for help.
Sam
|
Yeah I'm pretty sure these require rlang.
Thanks,
Jon
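One more thing that may be worth checking, offered as a guess rather than a diagnosis: "could not find function ":="" generally means `:=` was evaluated as an ordinary function call instead of being captured by a tidy-eval verb. That can happen if a `rename()` from some other attached package is masking dplyr's. The sketch below calls `dplyr::rename()` explicitly to rule that out; `cyl_renamed` is a made-up name for illustration.

```r
library(dplyr)

# Hypothetical new column name, held in a string
varname <- "cyl_renamed"

# Namespacing the call ensures dplyr's rename() handles `!!` and `:=`,
# even if another attached package also defines rename()
out <- mtcars %>%
  dplyr::rename(!!varname := cyl)

names(out)
```

Running `conflicts()` or checking the masking messages printed when packages are attached can show whether `rename` is defined more than once in the session.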
|
I have spent days trying to figure this out! Your code worked for me, thank you!! |
Feature requests
I would like to use tabyl for recurring procedures with a map() function, looping over several variables in a data frame without naming them, and instead have the loop access the next variable in line. It is possible with table(), but I don't like table()'s properties and would rather do it with tabyl().
Here is an example with table():
With tabyl I would have to specify each tabulation, correct?
That seems tedious to me and I would prefer a looping-solution.
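For reference, a minimal sketch of one way to loop tabyl over columns without typing each tabulation, assuming the goal is to cross-tabulate every column of `mtcars` against `gear`. It relies on the same `!!sym()` pattern used in the first reply; `vars` and `tabs` are illustrative names, not part of janitor.

```r
library(tidyverse)
library(janitor)

# Every column except the one we tabulate against
vars <- setdiff(names(mtcars), "gear")

tabs <- vars %>%
  set_names() %>%                                   # name the list by variable
  map(function(v) tabyl(mtcars, !!sym(v), gear))    # one two-way tabyl per column

# Look up a table by variable name rather than by position
tabs$cyl
```

Looping over positions instead would work the same way with `names(mtcars)[i]` in place of `v`, though name-based lookup is less likely to break if columns are added or reordered.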