Array Updates #633

joshessman-llnl · 2021-08-09T12:13:06Z

From #627

I envision making the following changes in each PR (roughly in order)

Extending Array to multiple dimensions - Extend Array to multiple dimensions #627
Replacing MCArray with Array (now that multiple dimensions are possible - Remove the old MCArray from Sidre #657
Adding a View (ArrayView?) class that provides indexing/access to a pointer it does not own, while removing the external data feature from Array - Replace Array external buffer functionality with ArrayView #678
Adding GPU support via a template parameter for the memory space of the Array/ArrayView - this needs to include a dynamic option so the memory space can be set at runtime - Full GPU support for axom::Array #695

Other (smaller?) feature requests/tasks:

The text was updated successfully, but these errors were encountered:

joshessman-llnl · 2021-09-16T18:46:33Z

Some caveats/gotchas/pitfalls that have come up as I've been working through the MCArray -> Array replacement:

MCArray functions like size()/capacity()/etc returned the number of tuples and not the number of elements. In the n-dimensional case this would probably correspond to the number of rank n-1 subtensors. I've chosen to instead match numpy.ndarray and unconditionally return the number of elements - see https://numpy.org/doc/stable/reference/generated/numpy.ndarray.size.html
emplace as implemented in MCArray doesn't really correspond to the idea of emplace in the sense that the STL uses it, so I've opted not to add that functionality to Array. If it's needed my preference would be to pick a different name for it. Additionally, the MCArray::emplace interface is a little vague in my opinion - that is, it might not necessarily be clear to someone what it does on first glance. Do any widely used numerical container classes provide similar functionality?
The semantics of "raw" insert() methods are pretty straightforward in the 1D case but significantly less so in the multidimensional case - if someone wants to insert of a single value (or an array of values of arbitrary length) at some arbitrary position in a multidimensional container, how would that affect the extents/dimensions/strides? My sense is we'd want to disallow it in the multidimensional case. I have implemented an insert(IndexType pos, Array other) (and a shortcut append(Array other)) that would allow the insertion of nonscalar data:

int[] values = {1, 2, 3, 4}; // 2x2 {{1,2},{3,4}}
Array<int, 2> arr(1, 2); // create a 2D array with extents 1x2
arr(0,0) = 5;
arr(0,1) = 6;
// TODO: this will be an ArrayView since we're only temporarily wrapping a pointer
arr.insert(0, Array<int, 2>(values, 2, 2)); // temporary 2D array wrapping "values" with extents 2x2
// will print [ 1 2 3 4 5 6 ]
std::cout << arr << "\n";

The above method requires that the array argument be of the same dimension of the calling object and have identical extents in all but the leading dimension (such that you could insert a 2x2 into a 3x2 to get a 5x2). Currently the "units" of the pos argument are number of elements but I think we'd either have to check that it aligns with a "block boundary" e.g. the above insertion would only make sense if pos was a multiple of 2 - or change the "units" to the size of the "rank n-1 subtensor" I describe above.

samuelpmishLLNL · 2021-09-16T20:24:56Z

MCArray functions like size()/capacity()/etc returned the number of tuples and not the number of elements. In the n-dimensional
case this would probably correspond to the number of rank n-1 subtensors. I've chosen to instead match numpy.ndarray and
unconditionally return the number of elements - see https://numpy.org/doc/stable/reference/generated/numpy.ndarray.size.html

I'm tempted to side with MCArray's interpretation, here. To me, container::size() is to C++ as len() is to python (i.e. the idiomatic way of getting the number of things in a list), and len(my_numpy_array) does return the "number of rank n-1 subtensors", not the total number of contained elements. So, if we're looking to numpy for inspiration, it might make more sense to copy their conventions rather than their naming directly.

Of course, it's important to be able to get the individual dimensions as well as the total number of items contained in the array.

samuelpmishLLNL · 2021-09-16T20:30:05Z

Regarding points (2) and (3), I don't expect to do any inserting, appending, or emplacing in the multidimensional case at all. So, if it were up to me, I wouldn't even worry about implementing those except in the 1D case (and then, if users decide they are essential after all, they could be added later).

white238 · 2021-09-16T21:13:39Z

My thoughts on 1.

I find it odd you would want to know how many tuples are in a multidimensional array. I would expect if i ask the size of a 2x2 array, that I would receive the answer 4 elements not 2 tuples. This just gets more confusing if you do a 3x3x3, is it 3, because that's the top level amount of tuples, or 9 tuple instead of 27.

I agree if this was an array of containers, I would expect the size to be the number of those containers. But the fact that it is tuples is an internal implementation detail, not something the user of the class should care about.

joshessman-llnl · 2021-09-16T21:31:48Z

@samuelpmishLLNL Your point on len is a good one - I did want to clarify that regardless of what size() does that "number of rank n-1 subtensors" is always accessible via arr.shape()[0] - additionally when Views are introduced we could also have begin()/end() iterate over those subtensors instead of individual elements

As for inserting/appending/emplacing - it looks like at least the inserting/appending gets used fairly frequently by users of MCArray, though it's often tricky to tell if it's actually doing a multidimensional insertion or if it's actually a "trivially multidimensional" array where the trailing dimension is 1. Emplacing for a MCArray is roughly "inserting x tuples at y position where each element of each tuple has the value z" - though this looks like it only gets used a handful of places in mint.

white238 · 2021-09-17T06:02:49Z

My two cents for what its worth... I personally feel size() and shape() serve different use cases in this class.

Given the fact that this is a multidimensional array, would a user expect to use size() in for loop or ask for the specific dimension via shape()?

With a 2d array:

for(int i = 0; i < arr.size(); ++i) { // size is number of tuples in this case
   std::tuple t = arr[i];
   for(int j = 0; j < t.size(); ++i) {
      std::cout << t[j] << std::endl;
   }
}

or:

for(int i = 0; i < arr.shape()[0]; ++i) {
   for(int j = 0; j < arr.shape()[1]; ++i) {
      std::cout << arr(i,j) << std::endl;
   }
}

kennyweiss · 2021-09-20T20:18:18Z

I agree with @white238 here. Size should be the product of the dimensions.

See also: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.size.html

EDIT: And @joshessman-llnl's original link (I should have read closer)
From personal experience, the original MCArray API was clunky to work with -- it was never clear when we were referring to the number of tuples of the full array.

samuelpmishLLNL · 2021-09-21T02:50:37Z

I don't want to belabor this discussion too much, but it may be worth looking to the mdspan proposal for inspiration on naming, since that is imminently coming to C++ and aims to accomplish a very similar task. size and other members of an mdspan are defined unambiguously in the proposal:

joshessman-llnl · 2021-09-30T15:19:19Z

The decision to temporarily punt on removing MCArray from mint was made in the interest of not having that task hold up further development.

I am not completely sure of this but I would imagine the vast majority of that effort would be characterized by the following:

Changing calculations that would be off by a factor of num_components now that size() et al operate in terms of number of elements
Replacing the nonstandard (in terms of the numpy and STL containers that Array is intended to follow where possible) methods like MCArray::emplace that get used in mint

publixsubfan · 2021-11-09T21:10:23Z

I noticed that ArrayView<T>'s constructor requires a non-const reference to Array<T>, because ArrayView<T> doesn't really protect the underlying data from being modified.

As a workaround, I think we should consider adding the ability to construct ArrayView<const T> from a const Array<T>. This would be akin to how std::span<const T> preserves immutability of the underlying data.

joshessman-llnl added Core Issues related to Axom's 'core' component GPU Issues related to GPU development labels Aug 9, 2021

joshessman-llnl self-assigned this Aug 9, 2021

joshessman-llnl changed the title ~~Array Updats~~ Array Updates Aug 9, 2021

joshessman-llnl mentioned this issue Aug 9, 2021

Extend Array to multiple dimensions #627

Merged

rhornung67 added the Reviewed label Aug 9, 2021

joshessman-llnl mentioned this issue Sep 22, 2021

Remove the old MCArray from Sidre #657

Merged

2 tasks

rhornung67 added this to the FY22 Development milestone Oct 12, 2021

joshessman-llnl mentioned this issue Nov 8, 2021

Full GPU support for axom::Array #695

Merged

2 tasks

rhornung67 modified the milestones: FY22 Development, Feb 2022 Release Nov 8, 2021

This was referenced Nov 17, 2021

Switch over to axom::Array LLNL/serac#596

Merged

Adds axom::Array::view() method and improves testing with kernels #725

Merged

Initialize memory in axom::Array by default, allow for creation of ArrayView<const T> #709

Merged

kennyweiss mentioned this issue Nov 23, 2021

Semantics of ArrayView and const #732

Closed

rhornung67 modified the milestones: March 2022 Release, FY22 Development Feb 14, 2022

rhornung67 removed this from the March 2022 Release milestone Mar 21, 2022

rhornung67 added this to the FY22 Development milestone Mar 21, 2022

kennyweiss mentioned this issue Apr 12, 2022

Miscellaneous axom::Array improvements #808

Merged

kennyweiss assigned publixsubfan and unassigned joshessman-llnl Mar 20, 2023

kennyweiss modified the milestones: FY22 Development, FY23 Development Mar 20, 2023

rhornung67 modified the milestones: FY23 Development, April 2024 Release Sep 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Array Updates #633

Array Updates #633

joshessman-llnl commented Aug 9, 2021 •

edited by publixsubfan

Loading

joshessman-llnl commented Sep 16, 2021 •

edited

Loading

samuelpmishLLNL commented Sep 16, 2021

samuelpmishLLNL commented Sep 16, 2021

white238 commented Sep 16, 2021 •

edited

Loading

joshessman-llnl commented Sep 16, 2021

white238 commented Sep 17, 2021 •

edited

Loading

kennyweiss commented Sep 20, 2021 •

edited

Loading

samuelpmishLLNL commented Sep 21, 2021

joshessman-llnl commented Sep 30, 2021 •

edited

Loading

publixsubfan commented Nov 9, 2021

Array Updates #633

Array Updates #633

Comments

joshessman-llnl commented Aug 9, 2021 • edited by publixsubfan Loading

joshessman-llnl commented Sep 16, 2021 • edited Loading

samuelpmishLLNL commented Sep 16, 2021

samuelpmishLLNL commented Sep 16, 2021

white238 commented Sep 16, 2021 • edited Loading

joshessman-llnl commented Sep 16, 2021

white238 commented Sep 17, 2021 • edited Loading

kennyweiss commented Sep 20, 2021 • edited Loading

samuelpmishLLNL commented Sep 21, 2021

joshessman-llnl commented Sep 30, 2021 • edited Loading

publixsubfan commented Nov 9, 2021

joshessman-llnl commented Aug 9, 2021 •

edited by publixsubfan

Loading

joshessman-llnl commented Sep 16, 2021 •

edited

Loading

white238 commented Sep 16, 2021 •

edited

Loading

white238 commented Sep 17, 2021 •

edited

Loading

kennyweiss commented Sep 20, 2021 •

edited

Loading

joshessman-llnl commented Sep 30, 2021 •

edited

Loading