Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "slicing" APIs to ArraySegment<T> #18007

Closed
KrzysztofCwalina opened this issue Aug 2, 2016 · 60 comments
Closed

Add "slicing" APIs to ArraySegment<T> #18007

KrzysztofCwalina opened this issue Aug 2, 2016 · 60 comments
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Collections help wanted [up-for-grabs] Good issue for external contributors wishlist Issue we would like to prioritize, but we can't commit we will get to it yet
Milestone

Comments

@KrzysztofCwalina
Copy link
Member

ArraySegment could be much more useful if it had the following additional APIs. I propose that we add these members to ArraySegment:

namespace System
{
    partial struct ArraySegment<T>
    {
        public T this[int index] { get; set;  }

        public static implicit operator ArraySegment<T>(T[] array);

        public static explicit operator T[](ArraySegment<T> segment);

        public T[] ToArray();

        public int Length { get; }

        public ArraySegment<T> Slice(int index, int count);

        public ArraySegment<T> Slice(int index);

        public void CopyTo(T[] destination);

        public void CopyTo(T[] destination, int destinationIndex);

        public void CopyTo(ArraySegment<T> destination);

        public static ArraySegment<T> Empty { get; }
    }
}
@omariom
Copy link
Contributor

omariom commented Aug 2, 2016

public static explicit operator T[](ArraySegment<T> segment);

Does it make a copy?

@KrzysztofCwalina
Copy link
Member Author

@omariom, yes. this is why I made it an explicit cast. Possibly it should be removed completely.

@terrajobst
Copy link
Member

terrajobst commented Aug 2, 2016

The nice about a conversion operatator is that the compiler will give you an error message like:

Cannot convert ArraySegment<Bananas> to @Bananas[]. An explicit conversion exists. Are you missing a cast?

Now, one could also argue that this is the problem if people just apply the cast without realizing that there is performance/memory penalty for doing so.

public T[] CreateArray();

I think we should call it ToArray(). That matches the name Linq has and thus shadows the Linq implementation with this one, which is likely being more performant.

Other than that, the proposals look good to me.

@omariom
Copy link
Contributor

omariom commented Aug 2, 2016

MemoryStream has ToArray as well.
Prefix To here is an indication that the method makes a copy of the underlying storage.

Good candidat for implicit operator is conversation to Span.
When Slices come to CoreCLR, of course.

@KrzysztofCwalina
Copy link
Member Author

Changed CreateArray to ToArray

@justinvp
Copy link
Contributor

justinvp commented Aug 3, 2016

Here are all of List<T>'s CopyTo overloads:

    public void CopyTo(T[] array);

    public void CopyTo(T[] array, int arrayIndex);

    public void CopyTo(int index, T[] array, int arrayIndex, int count);

Should ArraySegment<T> match?

In other words:

  1. Rename index in the proposal to arrayIndex (assuming index is the index of the destination array in which copying begins) to match the naming used by List<T>.CopyTo (and ICollection<T>.CopyTo).
  2. Add a source index parameter to the largest overload, to match List<T>.

@justinvp
Copy link
Contributor

justinvp commented Aug 3, 2016

Interestingly, ImmutableArray<T> uses different parameter names:

    public void CopyTo(T[] destination);

    public void CopyTo(T[] destination, int destinationIndex);

    public void CopyTo(int sourceIndex, T[] destination, int destinationIndex, int length);

Whereas ImmutableList<T> matches List<T>.

@mellinoe
Copy link
Contributor

mellinoe commented Aug 3, 2016

I like ImmutableArray's overloads and naming convention.

@KrzysztofCwalina
Copy link
Member Author

updated CopyTo to use "destination"

@svick
Copy link
Contributor

svick commented Aug 13, 2016

What is the point of adding Length, when Count already exists?

@davidfowl
Copy link
Member

davidfowl commented Aug 13, 2016

Are we going to add array segment to more APIs as well. Nothing takes array segment except for the web socket APIs

@KrzysztofCwalina
Copy link
Member Author

@davidfowl, yes. I think we should/must.

@karelz
Copy link
Member

karelz commented Sep 28, 2016

@KrzysztofCwalina is it ready for API review?

@terrajobst
Copy link
Member

terrajobst commented Oct 4, 2016

We should remove the explicit operator as it seems weird. And Empty should be static.

The indexer needs to perform a range check; it doesn't have to perform a null check because the range check can just use the fields (offset and count). The only case to observe a null reference is when the ArraySegment<T> was created by concurrent writes to a field, which we don't think warrants slowing down the indexer with additional checks.

Also, ToArray() should always copy, except when the array is empty or null, in these cases it should return Array.Empty<T>().

namespace System
{
    partial struct ArraySegment<T>
    {
        public T this[int index] { get; set;  }

        public static implicit operator ArraySegment<T>(T[] array);

        public T[] ToArray();

        public int Length { get; }

        public ArraySegment<T> Slice(int index, int count);

        public ArraySegment<T> Slice(int index);

        public void CopyTo(T[] destination);

        public void CopyTo(T[] destination, int destinationIndex);

        public void CopyTo(T[] destination, int destinationIndex, int count);

        public void CopyTo(ArraySegment<T> destination);

        public static ArraySegment<T> Empty { get; }
    }
}

@AlexRadch
Copy link
Contributor

AlexRadch commented Nov 28, 2016

Where new API should be implemented in corefx or coreclr repository?

@karelz
Copy link
Member

karelz commented Nov 28, 2016

The low-level APIs which already live in coreclr repo ('corlib') need to be done first there. All APIs should have tests and reference assemblies in corefx repo - it is tricky and not best documented process at this moment.
If you want to help with something simpler, I would suggest to start with corefx-only changes.

@AlexRadch
Copy link
Contributor

@karelz I made two simple PRs in corefx, but have issues with changing config files. Can you help me? dotnet/corefx#13975 and dotnet/corefx#13976

@karelz
Copy link
Member

karelz commented Nov 28, 2016

@AlexRadch the more reason to not attack coreclr+corefx changes which are more tricky.

I hope that @Priya91 and @ianhays will be able to help you with the two above. I myself still lack the experience in making those changes to be able to provide useful advice.

@Priya91
Copy link
Contributor

Priya91 commented Nov 28, 2016

@AlexRadch From your PR, seems like you're not adding the APIs proposed in this issue. I left feedback on one of your PRs. Please keep the conversation on related issues.

@AlexRadch
Copy link
Contributor

What difference between array slices, array spans and array segments ? I am confused :(

@karelz
Copy link
Member

karelz commented Dec 5, 2016

@Priya91 @AlexRadch is there PR for this issue? If yes, it should be linked.

Slices and spans are part of the new Slice<T> API. ArraySegment is the old API.

@Priya91
Copy link
Contributor

Priya91 commented Dec 5, 2016

is there PR for this issue?

I'm not aware of any..

@AlexRadch
Copy link
Contributor

AlexRadch commented Dec 6, 2016

@karelz Why add new methods to old API? May be better to mark ArraySegment as deprecated?

@jamesqo
Copy link
Contributor

jamesqo commented Dec 8, 2016

@AlexRadch ArraySegment will not be deprecated; there is nothing wrong with it. Span is just a new way of doing things. Using ArraySegment may be necessary if you want to get back the underlying array.

@terrajobst Regarding the proposal you have above

namespace System
{
    partial struct ArraySegment<T>
    {
        public T this[int index] { get; set;  }

        public static implicit operator ArraySegment<T>(T[] array);

        public T[] ToArray();

        public int Length { get; }

        public ArraySegment<T> Slice(int index, int count);

        public ArraySegment<T> Slice(int index);

        public void CopyTo(T[] destination);

        public void CopyTo(T[] destination, int destinationIndex);

        public void CopyTo(T[] destination, int destinationIndex, int count);

        public void CopyTo(ArraySegment<T> destination);

        public static ArraySegment<T> Empty { get; }
    }
}
  • Should the indexer return a ref T { get; } instead? This would reduce copying for structs-- see https://github.com/dotnet/coreclr/issues/1133.

  • Can we replace all of the CopyTo overloads with CopyTo(Span<T>)? I thought one of the main reasons for having Span was to eliminate array/offset/count methods.

  • While we're changing the APIs here, can we also have the type implement IEquatable<ArraySegment<T>> because it is a value type that already overrides Equals?

@jamesqo
Copy link
Contributor

jamesqo commented Dec 8, 2016

New proposed is

namespace System
{
    public struct ArraySegment<T> : IList<T>, IReadOnlyList<T>, IEquatable<ArraySegment<T>>
    {
        public ref T this[int index] { get; }

        public static implicit operator ArraySegment<T>(T[] array);

        public T[] ToArray();

        public int Length { get; }

        public ArraySegment<T> Slice(int index, int count);

        public ArraySegment<T> Slice(int index);

        public void CopyTo(Span<T> destination);

        public static ArraySegment<T> Empty { get; }
    }
}

@jamesqo
Copy link
Contributor

jamesqo commented Dec 8, 2016

@karelz Sorry, I just noticed this has already been marked api-approved... was the above spec the final decision, or can this be reviewed again?

@karelz
Copy link
Member

karelz commented Dec 8, 2016

There is always option to change ... changing APIs before we ship them in major release is "doable".

  1. I don't have opinion on public ref T this[int index] { get; } - @terrajobst is it a pattern we want to consider?
  2. I don't think we should replace CopyTo overloads with Span overload, because ArraySegment is about arrays and their segments -- so this is the place where it makes sense to add. cc: @KrzysztofCwalina @terrajobst any push back?
  3. I don't have opinion on IEquatable<ArraySegment<T>>. Do you have usage in mind? Or are you just applying common patterns?

@KrzysztofCwalina
Copy link
Member Author

KrzysztofCwalina commented Dec 8, 2016

re # 1: Span's indexer is now by ref (which was decided at an API review meeting), so we already approved this new pattern. There are trade offs here, but long term the wins are large, so I think we should go with it.
re # 2: The CopyTo parameter makes sense. Spans are about arrays too. The only thing I worry about is layering.
re # 3: I am not a big fan of O(n) Equals methods. Also, Span.Equals just compares the fields, not the items in the span.

@karelz
Copy link
Member

karelz commented Dec 12, 2016

Thanks @KrzysztofCwalina! That makes sense.

@AlexRadch your questions [1] and [2] are now answered -- we won't use Span<T>, we will go with the original proposal.
Regarding your [3], we should also stick to the original proposal. The original method makes perfect sense IMO -- you copy into array starting at index with specific count. The List.CopyTo overload tries to emulate segment of List IMO, which is not needed here - we already have 'segment' of the array.

@AlexRadch
Copy link
Contributor

AlexRadch commented Dec 12, 2016

@karelz ok about [1] and [2]
[3] segment have offset and count so if we remove index from List.CopyTo( int index, T[] array, int arrayIndex, int count) then why we do not remove count also.

But if we will remove offset and count than we already have such method public void CopyTo(T[] destination, int destinationIndex); I do not understand half removing.

So I suggest do not add public void CopyTo(T[] destination, int destinationIndex, int count); overload with half removing.

It can be easy coded as segment.Slice(newOffset, newCount).CopyTo(destination, destinationIndex).

@KrzysztofCwalina What do you think?

@karelz
Copy link
Member

karelz commented Dec 12, 2016

I could live without that overload CopyTo(T[] destination, int destinationIndex, int count). @KrzysztofCwalina?

@terrajobst
Copy link
Member

We just talked about it and concluded we should indeed remove this overload:

 public void CopyTo(T[] destination, int destinationIndex, int count);

We should leave the other CopyTo methods in, though.

@karelz
Copy link
Member

karelz commented Dec 13, 2016

@AlexRadch thanks for pointing it out. The API proposal in the top post is now updated. Feel free to implement it as defined in the top post.

@AlexRadch
Copy link
Contributor

AlexRadch commented Dec 13, 2016

@karelz I found 2 issues about Empty

  1. It should be static.
  2. I think we should add IsEmpty property like Span has.

@karelz
Copy link
Member

karelz commented Dec 13, 2016

Empty fixed to static (it was some copy-paste error, it was properly listed e.g. here: https://github.com/dotnet/corefx/issues/10528#issuecomment-251477256).

Let's track IsEmpty addition as separate issue, otherwise we will never converge on the API shape. Please file it, provide API shape and motivation (IMO "Span has it", is not good enough - I would like to see more details and usage patterns).

@AlexRadch
Copy link
Contributor

@karelz I created separate issue dotnet/corefx#14502

@karelz
Copy link
Member

karelz commented Jan 29, 2017

No activity/response for 1.5 months, unassigning the issue - it is "up for grabs" again, available for anyone to pick it up and finish it.
Anyone interested?

@danmoseley
Copy link
Member

@ianhays @safern I moved to Future as this API does not seem necessary for our 2.0 goals. Feel free to change if it is. You might do the same with other issues of this kind if any

@karelz
Copy link
Member

karelz commented Mar 1, 2017

@KrzysztofCwalina actually wanted this one to happen in 2.0 - he said he will do it himself if no one picks it up ...

@jamesqo
Copy link
Contributor

jamesqo commented Mar 1, 2017

@karelz Curious, when does 2.0 come out?

@karelz
Copy link
Member

karelz commented Mar 1, 2017

@jamesqo check the milestone description: https://github.com/dotnet/corefx/milestones
There's link to roadmap: https://github.com/dotnet/core/blob/master/roadmap.md -- currently set to Q3 2017.

@karelz
Copy link
Member

karelz commented Mar 2, 2017

@jamesqo if you want to pick up this issue, it would be appreciated ;-)

@jamesqo
Copy link
Contributor

jamesqo commented Mar 3, 2017

@karelz Sure, one thing though. It seems like from the previous discussion @KrzysztofCwalina decided in favor of having a ref T indexer, but the proposal was never updated?

@jamesqo
Copy link
Contributor

jamesqo commented Mar 3, 2017

@karelz, @terrajobst made a post further down here that had the latest proposed API at the time, so I think the current proposal looks like

namespace System
{
    partial struct ArraySegment<T>
    {
        public ref T this[int index] { get; }

        public static implicit operator ArraySegment<T>(T[] array);

        public T[] ToArray();

        public ArraySegment<T> Slice(int index, int count);

        public ArraySegment<T> Slice(int index);

        public void CopyTo(T[] destination);

        public void CopyTo(T[] destination, int destinationIndex);

        public void CopyTo(ArraySegment<T> destination);

        public static ArraySegment<T> Empty { get; }
    }
}

Correct?

@jamesqo
Copy link
Contributor

jamesqo commented Mar 3, 2017

@KrzysztofCwalina By the way, why are we having ArraySegment<T>.Length when we already have ArraySegment<T>.Count? Is that required for the compiler to recognize ArraySegment as a type that can be sliced when spans get language support? It seems really weird/confusing to have two properties that do the same thing.

@KrzysztofCwalina
Copy link
Member Author

KrzysztofCwalina commented Mar 3, 2017

The point was to make it easier to switch from array, span, etc. to arraysegment and vice versa. But I agree there is a negative aspect of the addition.

@jamesqo
Copy link
Contributor

jamesqo commented Mar 3, 2017

@KrzysztofCwalina OK. I have pushed it out of the proposal for now, then; it can be added later if you change your mind.

I have a question that came up while I was working on the implementation: should Slice accept negative indices? If you do new ArraySegment<T>(new T[1], 1, 0).Slice(-1) should you get an ArraySegment that contains the first element or should an exception be thrown?

@jamesqo
Copy link
Contributor

jamesqo commented Apr 17, 2017

@karelz, regarding whether ArraySegment<T> should implement IEquatable or not, here is an issue that would have been avoided entirely if it was IEquatable. To see if 2 ArraySegments are equal, xUnit's default comparer checks first if they are IEquatable. Since they aren't, eventually it falls back to treating them like collections and calling GetEnumerator, which throws for a default ArraySegment<T>. I think there is value in implementing the interface.

@karelz
Copy link
Member

karelz commented Apr 17, 2017

@jamesqo I don't have much experience with IEquatable interface. What you say makes sense, but I let btter experts to make the call. It might be easier to move it into separate issue as it needs API approval.

@KrzysztofCwalina you said you really want this to happen in .NET Core 2.0 - can you please help drive it (incl. answering @jamesqo's questions), if it is still true? Thanks! (sorry I don't have enough expertise on this type + not enough bandwidth lately :()

@danmoseley
Copy link
Member

@karelz below are currently implemented in coreclr but not exposed in corefx. Ideally we would have tests and expose them. Some of this was implemented by @jamesqo, some by another contributor who has left.

     public struct ArraySegment<T> : ICollection<T>, IEnumerable, IEnumerable<T>, IList<T>, IReadOnlyCollection<T>, IReadOnlyList<T> {
        public static ArraySegment<T> Empty { get; }
        public T this[int index] { get; set; }
        public void CopyTo(ArraySegment<T> destination);
        public void CopyTo(T[] destination);
        public void CopyTo(T[] destination, int destinationIndex);
        public ArraySegment<T>.Enumerator GetEnumerator();
        public static implicit operator ArraySegment<T> (T[] array);
        public ArraySegment<T> Slice(int index);
        public ArraySegment<T> Slice(int index, int count);
        public T[] ToArray();
        public struct Enumerator : IDisposable, IEnumerator, IEnumerator<T> {
            public T Current { get; }
            object System.Collections.IEnumerator.Current { get; }
            public void Dispose();
            public bool MoveNext();
            void System.Collections.IEnumerator.Reset();
        }
    }

@danmoseley
Copy link
Member

/cc @stephentoub fyi

@stephentoub
Copy link
Member

They're being exposed here: dotnet/corefx#17033

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 2.0.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Collections help wanted [up-for-grabs] Good issue for external contributors wishlist Issue we would like to prioritize, but we can't commit we will get to it yet
Projects
None yet
Development

No branches or pull requests