-
Notifications
You must be signed in to change notification settings - Fork 981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new with optimization to allow avoid [ overhead #4488
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #4488 +/- ##
==========================================
- Coverage 99.60% 99.48% -0.12%
==========================================
Files 72 73 +1
Lines 13918 13988 +70
==========================================
+ Hits 13863 13916 +53
- Misses 55 72 +17
Continue to review full report at Codecov.
|
nc = length(x) | ||
if (isFALSE(w[["i"]]) && !missing(i)) i = with_i(i, len=nr, verbose=verbose) | ||
if (isFALSE(w[["j"]]) && !missing(j)) j = with_j(j, len=nc, x=x, verbose=verbose) | ||
if ((isFALSE(w[["i"]]) && missing(j)) || (isFALSE(w[["j"]]) && missing(i)) || (isFALSE(w[["i"]]) && isFALSE(w[["j"]]))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bit hard to figure out what this branch is for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from body of this branch one can see that branch is for early return and escape rest of [ processing
basic API looks good. Do you know if |
if it is
The "faster" is not that much relevant here, I would call it low overhead, which eventually can be faster when looping many times. Otherwise speed difference is insignificant. |
OK. Still if "low-overhead-ness" differs between |
|
Most options are lists, including discussion of For some parts, maybe we could include |
@ColeMiller1 to make this NSE (so the argument can be evaluated in separate line and result passed to |
This comment has been minimized.
This comment has been minimized.
if (isFALSE(w[["i"]]) && !missing(i)) i = with_i(i, len=nr, verbose=verbose) | ||
if (isFALSE(w[["j"]]) && !missing(j)) j = with_j(j, len=nc, x=x, verbose=verbose) | ||
if ((isFALSE(w[["i"]]) && missing(j)) || (isFALSE(w[["j"]]) && missing(i)) || (isFALSE(w[["i"]]) && isFALSE(w[["j"]]))) { | ||
if (missing(i)) i = seq_len(nr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seq_len(nr)
is only needed on the which
logic branch. Otherwise, we could do i = NULL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but eventually subsetDT could not copy when NULL provided, although it does copy now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main point being that if missing(i)
, this would realize an integer vector seq_len(nr)
when it is unnecessary except for the which()
branch.
Regarding allowing shallow copies in CsubsetDT
, that would be a nice feature. A data.frame does not appear to make a copy when selecting columns. At least that's what memory profiling using bench::mark
suggests.
closes #3735 and #4485
from 27s down to 5s, together with #4484
and 13s down to 5s for this PR alone
Once the general idea and API will be approved, the following items should be added:
with
andj
processing in downstream code in[.data.table