String coercion inserts spaces, as.matrix.data.table #4209

Open
alistaire47 opened this issue Jan 29, 2020 · 3 comments · May be fixed by #4144
@alistaire47

When converting a data.table to a matrix forces a type coercion (yes, this is bad and shouldn't be done, so this is definitely low-priority), as.matrix on a data.table pads the resulting strings with spaces, whereas on a data.frame it does not:

> as.matrix(data.frame(a = LETTERS[1:2], b = c(FALSE, TRUE)))
     a   b      
[1,] "A" "FALSE"
[2,] "B" "TRUE" 
> as.matrix(data.table(a = LETTERS[1:2], b = c(FALSE, TRUE)))
     a   b      
[1,] "A" "FALSE"
[2,] "B" " TRUE"

I assume this is a result of C++ typing somehow, but I don't know enough to make a PR, sorry.

@MichaelChirico
Member

as.matrix.data.frame uses as.character for logical columns, whereas as.matrix.data.table uses format. format pads its output to a common width, which is where the leading space in " TRUE" comes from.
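
To make the difference concrete, here is a minimal base R sketch of the two conversions applied to the logical column from the report. The variable names are illustrative only and do not come from either package's source.

```r
# Logical column from the example in this issue.
lgl <- c(FALSE, TRUE)

# as.character() converts each element on its own, so nothing is padded.
via_as_character <- as.character(lgl)

# format() pads every element to a common width, so "TRUE" gains a
# leading space to match the width of "FALSE".
via_format <- format(lgl)

print(via_as_character)  # "FALSE" "TRUE"
print(via_format)        # "FALSE" " TRUE"
```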

@sritchie73 is this covered by your ongoing #4144? I haven't been able to install and test it myself yet.

@MichaelChirico
Member

In fact, it seems this is something that was fixed quite recently in base R itself!

wch/r-source@bd3b7ef

So I guess our issue is a vestige of the same logic having been copied over from as.matrix.data.frame many years ago.

@sritchie73
Contributor

sritchie73 commented Jan 30, 2020

This is fixed in #4144. This is the output I see on my working test branch:

     a   b      
[1,] "A" "FALSE"
[2,] "B" "TRUE" 

The current version of data.table on CRAN simply uses base::unlist(), while #4144 is implemented in C and uses our internal memrecycle function, which has the desired behaviour:

> dt = data.table(a = LETTERS[1:2], b = c(FALSE, TRUE))
> rbind(dt[,.(a)], dt[,.(b)], use.names=FALSE)[,a] # rbindlist uses memrecycle for type coercion
[1] "A"     "B"     "FALSE" "TRUE"
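
Until the fix is available, one possible workaround is to convert each column with as.character() before building the matrix. The helper below is a hypothetical sketch, not part of data.table. It relies only on the input being a list of equal-length columns, which holds for both a data.frame and a data.table, and it is shown on a data.frame so that it runs without extra packages.

```r
# Hypothetical helper (not a data.table function): build a character
# matrix column by column with as.character(), which does not pad.
as_char_matrix <- function(x) {
  cols <- lapply(x, as.character)  # one unpadded character vector per column
  do.call(cbind, cols)             # column names are taken from names(x)
}

df <- data.frame(a = LETTERS[1:2], b = c(FALSE, TRUE),
                 stringsAsFactors = FALSE)
m <- as_char_matrix(df)
print(m)
```

Note that as.character() also drops any formatting that format() would apply to numeric columns, so this sketch is only a reasonable fit when unpadded strings are what you want.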

sritchie73 linked a pull request on Jan 30, 2020 that will close this issue
jangorecki changed the title from "String coercion inserts spaces" to "String coercion inserts spaces, as.matrix.data.table" on Apr 5, 2020