Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Implement slicing APIs for ArraySegment<T> #9926

Merged
merged 8 commits into from
Apr 10, 2017
Merged

Conversation

jamesqo
Copy link

@jamesqo jamesqo commented Mar 3, 2017

Progress towards https://github.com/dotnet/corefx/issues/10528.

I let Slice accept negative indices so the start of the ArraySegment<T> can backtrack to an earlier position in the array. Please let me know if you think this is a good idea or not (checking for it would involve some additional overhead).

/cc @jkotas, @KrzysztofCwalina

// another generic type in addition to ArraySegment<T> for new type parameters.
private static readonly T[] s_emptyArray = new T[0];

public static ArraySegment<T> Empty { get; } = new ArraySegment<T>(s_emptyArray);
Copy link
Author

@jamesqo jamesqo Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it would be better for perf to do new ArraySegment<T>(new T[0]) here and then return Empty._array in ToArray() to avoid another static field?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single static field would be slightly better, I think.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single static field would be slightly better, I think.

K, will remove s_emptyArray then (assuming by 'single static field' you mean only the Empty property).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't s_emptyArray be replaced by Array.Empty<T>?

@jamesqo
Copy link
Author

jamesqo commented Mar 3, 2017

@benaadams FYI

@@ -106,12 +112,11 @@ public int Count
}
}

public ref T this[int index] => ref _array[_offset + index];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this indexer will have poor performance characteristics on reference types for both reading and writing because of array covariance (it will have array covariance check for both). @KrzysztofCwalina Have you considered this during the API review?

I think this should be the classic get/set; forget about ref indexer here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas I suggested it back in https://github.com/dotnet/corefx/issues/10528#issuecomment-265625210 and @KrzysztofCwalina approved of it. I thought it would be more flexible since you can do everything with a ref T (and more) that you can do with a regular indexer. If you say it is slower, however, I would be OK with having the latter.

@jamesqo
Copy link
Author

jamesqo commented Mar 4, 2017

@jkotas Thanks for reviewing. I've updated the PR in response to your comments.

@jamesqo
Copy link
Author

jamesqo commented Mar 4, 2017

Please hold on merging until I make corefx tests, actually. Thanks.

if (_array == null)
ThrowHelper.ThrowInvalidOperationException(ExceptionResource.InvalidOperation_NullArray);
Contract.EndContractBlock();
get { return _array[_offset + index]; }
Copy link
Author

@jamesqo jamesqo Mar 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas, came up with a question while writing the corefx tests. Do you think argument validation should be added here to prevent people from accessing items that aren't within range of the segment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would think that the index should be validated here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas I am surprised, I thought you would have said no. I think it could discourage people from using the new API since ArraySegment<T> is mostly used in perf-sensitive places. However, I suppose if we throw an exception here we could always decide to allow it in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any example where ArraySegment is used in perf-sensitive place?

ArraySegment<T> is in the same category as List<T> in my mind - pretty convenient and lightweight, but still adds some overhead. I do not think that the perf-sensitive algorithms should be written against ArraySegment<T>. They should be written directly against array (or potentially Span<T> in future).

Copy link
Author

@jamesqo jamesqo Mar 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas Well, a quick GitHub search through Kestrel shows a lot of usages... https://github.com/aspnet/KestrelHttpServer/search?utf8=%E2%9C%93&q=ArraySegment. Here is one place where this API could make code more readable. I don't think we want people manually writing as.Array[as.Offset + index] to get better perf, because would defeat the whole point of this API, right?

Copy link
Member

@jkotas jkotas Mar 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is one place where this API could make code more readable.

Agree. However, this example is not performance critical. The extra argument validation is not a problem.

I don't think we want people manually writing as.Array[as.Offset + index] to get better perf

If you want to squeeze every last cycle out, you will probably write even something more convoluted. And that's perfectly fine. As I have said, the perf-sensitive algorithms should not be written against ArraySegment<T>.

I do not see a good reason why the ArraySegment indexer should deviate from framework design guidelines and ship argument validation.

Copy link
Author

@jamesqo jamesqo Mar 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas OK, not disagreeing. If someone finds the extra overhead to be a problem maybe they can submit a PR here to use Unsafe to circumvent the extra check. Or failing that, we can always decide to allow negative indices later.

@@ -145,6 +167,34 @@ public bool Equals(ArraySegment<T> obj)
return obj._array == _array && obj._offset == _offset && obj._count == _count;
}

public ArraySegment<T> Slice(int index)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C# team is considering adding slicing syntax. It's not yet clear what would libraries have to do to support it. It could be Slice(int) method (just like here) or Slice(Range) method.

When this C# syntax ships, it would be great if ArraySegment supported it.

@jaredpar, @MadsTorgersen, any word on the syntax? Shoudl we just assume the Range solution? Should we add Range struct to the platform?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KrzysztofCwalina Is this blocking this PR? In the case they decide to add a Range struct we can just add another overload in addition to these two.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's blocking.

@jamesqo
Copy link
Author

jamesqo commented Mar 13, 2017

@jkotas Responded to your feedback. You can merge if you think this is good to go.

{
ThrowInvalidOperationIfDefault();
destination.ThrowInvalidOperationIfDefault();
System.Array.Copy(_array, _offset, destination._array, destination._offset, _count);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can write beyond limits of the destination

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas Good catch. Fixed

public ArraySegment<T> Slice(int index)
{
ThrowInvalidOperationIfDefault();
// Note: `index` is allowed to be negative so the start of this ArraySegment can be moved backwards.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the negative index for slicing discussed anywhere?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KrzysztofCwalina Do you have an opinion for allowing negative indices for slicing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@terrajobst Do you have an opinion on this either? (This is blocking the APIs from being added.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas @KrzysztofCwalina how well do negative bounds work for our optimized bounds checking pattern?

if ((uint)index >= (uint)_count) ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index);

I think we want to make sure indexing is blazing fast, so I don't think we want to allow negative indices as it would cause additional branches. In slicing it might be OK performance wise, but would feel inconsistent if I can't do it for regular indexing.

It's also a pattern that doesn't exist in the BCL. I don't think we should introduce it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to make sure indexing is blazing fast

Plain array is blazing fast. ArraySegment is not. It has extra overhead by design.

It's also a pattern that doesn't exist in the BCL. I don't think we should introduce it here.

Agree.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated PR to disallow negative indices.

@@ -106,12 +110,31 @@ public int Count
}
}

public Enumerator GetEnumerator()
public T this[int index]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we already discussed the potential for this returning ref T instead of T? e.g.

public ref T this[int index]
{
    get
    {
        if ((uint)index >= (uint)_count) ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.index);
        return ref _array[_offset + index];
    }
}

?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

This is why ref returns were invented. Definitely should be used here because there are no compat issues to consider on this new API.

Copy link
Member

@jkotas jkotas Apr 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref T returning indexer will not work well with array covariance. The rest of the ArraySegment is compatible with array covariance, so it will look odd.

String[] arr = new String[] { ... };
ArraySegment<object> segment = new ArraySegment(arr);
Object elem = segment[0]; // Throws exception if the indexer returns `ref T`

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arrays.....

Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@jkotas jkotas merged commit 40ecbb7 into dotnet:master Apr 10, 2017
@jamesqo jamesqo deleted the aseg.slice branch April 10, 2017 17:29
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.