-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Implement slicing APIs for ArraySegment<T> #9926
Conversation
// another generic type in addition to ArraySegment<T> for new type parameters. | ||
private static readonly T[] s_emptyArray = new T[0]; | ||
|
||
public static ArraySegment<T> Empty { get; } = new ArraySegment<T>(s_emptyArray); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it would be better for perf to do new ArraySegment<T>(new T[0])
here and then return Empty._array
in ToArray()
to avoid another static field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single static field would be slightly better, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single static field would be slightly better, I think.
K, will remove s_emptyArray
then (assuming by 'single static field' you mean only the Empty
property).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't s_emptyArray
be replaced by Array.Empty<T>
?
@benaadams FYI |
@@ -106,12 +112,11 @@ public int Count | |||
} | |||
} | |||
|
|||
public ref T this[int index] => ref _array[_offset + index]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this indexer will have poor performance characteristics on reference types for both reading and writing because of array covariance (it will have array covariance check for both). @KrzysztofCwalina Have you considered this during the API review?
I think this should be the classic get/set; forget about ref indexer here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas I suggested it back in https://github.com/dotnet/corefx/issues/10528#issuecomment-265625210 and @KrzysztofCwalina approved of it. I thought it would be more flexible since you can do everything with a ref T
(and more) that you can do with a regular indexer. If you say it is slower, however, I would be OK with having the latter.
@jkotas Thanks for reviewing. I've updated the PR in response to your comments. |
Please hold on merging until I make corefx tests, actually. Thanks. |
if (_array == null) | ||
ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_NullArray); | ||
Contract.EndContractBlock(); | ||
get { return _array[_offset + index]; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas, came up with a question while writing the corefx tests. Do you think argument validation should be added here to prevent people from accessing items that aren't within range of the segment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would think that the index should be validated here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas I am surprised, I thought you would have said no. I think it could discourage people from using the new API since ArraySegment<T>
is mostly used in perf-sensitive places. However, I suppose if we throw an exception here we could always decide to allow it in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any example where ArraySegment is used in perf-sensitive place?
ArraySegment<T>
is in the same category as List<T>
in my mind - pretty convenient and lightweight, but still adds some overhead. I do not think that the perf-sensitive algorithms should be written against ArraySegment<T>
. They should be written directly against array (or potentially Span<T>
in future).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas Well, a quick GitHub search through Kestrel shows a lot of usages... https://github.com/aspnet/KestrelHttpServer/search?utf8=%E2%9C%93&q=ArraySegment. Here is one place where this API could make code more readable. I don't think we want people manually writing as.Array[as.Offset + index]
to get better perf, because would defeat the whole point of this API, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is one place where this API could make code more readable.
Agree. However, this example is not performance critical. The extra argument validation is not a problem.
I don't think we want people manually writing as.Array[as.Offset + index] to get better perf
If you want to squeeze every last cycle out, you will probably write even something more convoluted. And that's perfectly fine. As I have said, the perf-sensitive algorithms should not be written against ArraySegment<T>
.
I do not see a good reason why the ArraySegment indexer should deviate from framework design guidelines and ship argument validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas OK, not disagreeing. If someone finds the extra overhead to be a problem maybe they can submit a PR here to use Unsafe
to circumvent the extra check. Or failing that, we can always decide to allow negative indices later.
@@ -145,6 +167,34 @@ public bool Equals(ArraySegment<T> obj) | |||
return obj._array == _array && obj._offset == _offset && obj._count == _count; | |||
} | |||
|
|||
public ArraySegment<T> Slice(int index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C# team is considering adding slicing syntax. It's not yet clear what would libraries have to do to support it. It could be Slice(int) method (just like here) or Slice(Range) method.
When this C# syntax ships, it would be great if ArraySegment supported it.
@jaredpar, @MadsTorgersen, any word on the syntax? Shoudl we just assume the Range solution? Should we add Range struct to the platform?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KrzysztofCwalina Is this blocking this PR? In the case they decide to add a Range
struct we can just add another overload in addition to these two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's blocking.
@jkotas Responded to your feedback. You can merge if you think this is good to go. |
{ | ||
ThrowInvalidOperationIfDefault(); | ||
destination.ThrowInvalidOperationIfDefault(); | ||
System.Array.Copy(_array, _offset, destination._array, destination._offset, _count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can write beyond limits of the destination
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas Good catch. Fixed
public ArraySegment<T> Slice(int index) | ||
{ | ||
ThrowInvalidOperationIfDefault(); | ||
// Note: `index` is allowed to be negative so the start of this ArraySegment can be moved backwards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was the negative index for slicing discussed anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KrzysztofCwalina Do you have an opinion for allowing negative indices for slicing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@terrajobst Do you have an opinion on this either? (This is blocking the APIs from being added.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas @KrzysztofCwalina how well do negative bounds work for our optimized bounds checking pattern?
if ((uint)index >= (uint)_count) ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index);
I think we want to make sure indexing is blazing fast, so I don't think we want to allow negative indices as it would cause additional branches. In slicing it might be OK performance wise, but would feel inconsistent if I can't do it for regular indexing.
It's also a pattern that doesn't exist in the BCL. I don't think we should introduce it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to make sure indexing is blazing fast
Plain array is blazing fast. ArraySegment is not. It has extra overhead by design.
It's also a pattern that doesn't exist in the BCL. I don't think we should introduce it here.
Agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated PR to disallow negative indices.
@@ -106,12 +110,31 @@ public int Count | |||
} | |||
} | |||
|
|||
public Enumerator GetEnumerator() | |||
public T this[int index] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we already discussed the potential for this returning ref T
instead of T
? e.g.
public ref T this[int index]
{
get
{
if ((uint)index >= (uint)_count) ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index);
return ref _array[_offset + index];
}
}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
This is why ref returns were invented. Definitely should be used here because there are no compat issues to consider on this new API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ref T
returning indexer will not work well with array covariance. The rest of the ArraySegment is compatible with array covariance, so it will look odd.
String[] arr = new String[] { ... };
ArraySegment<object> segment = new ArraySegment(arr);
Object elem = segment[0]; // Throws exception if the indexer returns `ref T`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arrays.....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
Progress towards https://github.com/dotnet/corefx/issues/10528.
I let
Slice
accept negative indices so the start of theArraySegment<T>
can backtrack to an earlier position in the array. Please let me know if you think this is a good idea or not (checking for it would involve some additional overhead)./cc @jkotas, @KrzysztofCwalina