kokkos / mdspan

Reference implementation of mdspan targeting C++23

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Line 132 of layout_right.hpp seems wrong

riclarsson opened this issue · comments

Hi,

I think the line should simply read decltype(other.stride(0)) stride = 1;. I guess you have that index type declared somewhere but I am not sure where.

The problem seems to only present itself when using the full_extent_t stuff with indices in a non-optimal way. I need this because I am replacing code that has these features. (I also need something like a partial_extent_t but I wrote my own for that for now because I could't find anything in the documentation.)

The most appropriate thing is probably a static cast to index_type (which is a member of the class). I also fixed the missing noexcept which is in the standard but was not in the implementation. The precondition check is now only done with NDEBUG not being defined.

@riclarsson can you check that this is fixed for you? Also what do you mean with partial_extent_t? What is the difference of that to just giving a pair(begin,end)?

About the updated code

@crtrott I've since gone on to not allow turning a strided mdspan back into an exhaustive mdspan because it is simply better to not allow this if possible. That said, I went back to my old code and your update does remove the warning about comparing different index sizes. Thank you very much!

However, I think the solution is not optimal. The following code-snippet demonstrates a new problem:

  std::vector<double> v(6);
  for (std::int64_t i = 0; i < 6; i++)
    v[i] = static_cast<double>(1 + 10 * i);
  for (std::int64_t i = 0; i < 6; i++)
    std::cout << v[i] << ' ';
  std::cout << '\n';

  stdx::mdspan<double, stdx::dextents<std::int64_t, 2>> m(
      v.data(), std::array<std::int64_t, 2>{3, 2});
  for (std::int64_t i = 0; i < 3; i++)
    for (std::int64_t j = 0; j < 2; j++)
      std::cout << m(i, j) << ' ';
  std::cout << '\n';

  stdx::mdspan<double, stdx::dextents<std::int64_t, 1>, stdx::layout_stride>
      vs = stdx::submdspan(m, stdx::full_extent, 0);
  for (std::int64_t i = 0; i < 3; i++)
    std::cout << vs[i] << ' ';
  std::cout << '\n';

  stdx::mdspan<double, stdx::dextents<std::int64_t, 1>> ve(vs);
  for (std::int64_t i = 0; i < 3; i++)
    std::cout << ve[i] << ' ';
  std::cout << '\n';

This compiles and runs to the end using CMake Release flags for both GCC12 and Clang15. The output is:

1 11 21 31 41 51 
1 11 21 31 41 51 
1 21 41 
1 11 21

So the exhaustive mdspan now just takes the first three elements instead of throwing or terminating.

With the RelWithDebInfo CMake mode, the call terminates. It also produces a much nastier warning than I was having with the integer comparison. I honestly think it is a bad idea to mark this particular constructor noexcept to begin with...

About the partial_extent_t

You didn't have, or I didn't find, strided_slice two weeks ago. I can work with that struct going forward. It only lacks one feature, which is a combination with full_extent_t. The most common strided access in our (admittedly old and add-as-you-go) program is to take all the elements at some offset and/or stride. I don't think this is supported by your current strided_slice proposal. Since your strided_slice is templated, it would be nice if its ExtentType could simply be full_extent_t to just mimic the idea of using everything that is available, but I will just work around that for now

Ah I see the warning ... I guess we could just abort/terminate instead of throw.
Regarding full release mode checking: this check is gonna be reasonably expensive hence hiding it behind NDEBUG. I guess thats something to disucss long term.

@crtrott About strided_slice, I am very confused how it is intended to work for negative strides. I spent the rest of this morning trying to port my code to use strided_slice instead of a custom slicer but I cannot get it working. The following:

  std::vector<double> v(3);
  for (std::int64_t i = 0; i < 3; i++)
    v[i] = static_cast<double>(1 + 10 * i);

  stdx::mdspan<double, stdx::dextents<std::int64_t, 1>> ve{
      v.data(), std::array<std::int64_t, 1>{3}};

  auto rvs = stdx::submdspan(
      ve,
      stdx::strided_slice<std::int64_t, std::int64_t, std::int64_t>{2, 3, -1});

  auto rvs2 = stdx::submdspan(
      ve,
      stdx::strided_slice<std::int64_t, std::int64_t, std::int64_t>{2, 2, -2});

  for (std::int64_t i = 0; i < 3; i++)
    std::cout << ve[i] << ' ';
  std::cout << '\n';

  for (std::int64_t i = 0; i < rvs.extent(0); i++)
    std::cout << rvs[i] << ' ';
  std::cout << '\n';

  for (std::int64_t i = 0; i < rvs2.extent(0); i++)
    std::cout << rvs2[i] << ' ';
  std::cout << '\n';

should intuitively output:

1 11 21 
21 11 1 
21 1 

Instead, it only outputs:

1 11 21 

21 

I would have expected that this is what observably happens when the code above is run:

  std::vector<double> v(3);
  for (std::int64_t i = 0; i < 3; i++)
    v[i] = static_cast<double>(1 + 10 * i);

  stdx::mdspan<double, stdx::dextents<std::int64_t, 1>> ve{
      v.data(), std::array<std::int64_t, 1>{3}};

  stdx::strided_slice<std::int64_t, std::int64_t, std::int64_t> rev{2, 3, -1};
  stdx::mdspan<double, stdx::dextents<std::int64_t, 1>, stdx::layout_stride>
      rvs{ve.data_handle() + ve.stride(0) * rev.offset,
          {std::array<std::int64_t, 1>{rev.extent},
           std::array<std::int64_t, 1>{rev.stride * ve.stride(0)}}};

  stdx::strided_slice<std::int64_t, std::int64_t, std::int64_t> rev2{2, 2, -2};
  stdx::mdspan<double, stdx::dextents<std::int64_t, 1>, stdx::layout_stride>
      rvs2{ve.data_handle() + ve.stride(0) * rev2.offset,
           {std::array<std::int64_t, 1>{rev2.extent},
            std::array<std::int64_t, 1>{rev2.stride * ve.stride(0)}}};

  for (std::int64_t i = 0; i < 3; i++)
    std::cout << ve[i] << ' ';
  std::cout << '\n';

  for (std::int64_t i = 0; i < rvs.extent(0); i++)
    std::cout << rvs[i] << ' ';
  std::cout << '\n';

  for (std::int64_t i = 0; i < rvs2.extent(0); i++)
    std::cout << rvs2[i] << ' ';
  std::cout << '\n';

(This code-snippet of course works for any strided mdspan, but is also very ugly because of the pointer semantics arithmetics.)

Note: none of the layouts we got support negative strides. And the strided_slice thing is not currently intended to support it either. See: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2630r2.html
There is a precondition that stride of strided_slice is >0 on all the functions. Generally, we do implement a good chunk of the mandates, but we don't necessarily implement all precondition checks yet (since you don't need to have them to be standard compliant, and we don't have unlimited resources).

This is the relevant line:

- if Sk is a specialization of strided_slice and sk.extent == 0 is false, sk.stride > 0 is true,

@crtrott Ok, so I understand that strided_slice is not intended to work with negative strides. It's a shame I will not be able to use it, but it is very good to know. Are you open to change that or not, considering it is not in the C++ standard yet?

I am more concerned about your other statement, that you do not have a layout that supports negative strides. Can you clarify?

Because your implementation of mdspan does support negative strides with layout_stride. The snippet I put above works using negative strides with the layout_stride (though I see now we have a minor difference in our extent-definition looking through your preconditions more carefully --- you use memory space I use element extent).

Your submdspan implementation also supports at least keeping the negative strides around. I have used this extensively in both my wrapper class's access operator and in my sub-mdspan-iterator implementation.

EDIT: Not to say that I don't agree with the design. I think negative strides are a pain. If I manage to convince people I work with about that is another issue altogether that I am currently attempting.

The C++ standard has a precondition on layout_stride constructors for strides to be larger zero, so its a precondition violation not to do that. If you instantiate it with negative strides there are likely a couple issues:

  1. required_span_size() is wrong:
      stdx::layout_stride::mapping<stdx::dextents<int,1>> map(stdx::dextents<int,1>(10),std::array<int,1>{-1});
      std::cout << map.required_span_size(); // prints -8 ...
  2. If you mix negative and positive strides you are more likely when not to violate the always_unique property.
  3. mappings MUST return positive values between 0 and required_span_size().

However: it is NOT against the rules to write a layout which DOES support negative strides.
Most likely you would need to save some offset though. I.e.

offset = 0;
for(rank_type r=0; r<extents_type::rank(); r++)
   offset -= stride(r)<0 ? stride(r) * (extents().extent(r)-1) : 0;

Required span size is then:

req_span_size = offset;
for(rank_type r=0; r<extents_type::rank(); r++)
   offset += stride(r)>0 ? stride(r) * (extents().extent(r)-1) : 0;

And then the mapping operator would look something equivalent to this:

template<class ... Indicies>
index_type operator()(Indicies ... idxs) const {
  array ii{static_cast<index_type>(idxs)...};
  index_type idx = offset;
  for(rank_type r = 0; r<extents_type::rank(); r++) idx += ii[r] * stride(r);
  return ii;
}

This way you return something between zero and some max value and it all hangs together. I am not sure what exactly the determination is to figure out whether the layout is unique. I think it might be the same as for std::layout_stride but just using the abs values of the strides?

Anyway does that make all sense?

So the reasons we didn't support negative stride are in combination:

  • need to store an extra value
  • generally the layout could be non-unique more easily
  • and all in all it was hard enough to get mdspan into the standard as is. We can certainly try to make it incrementally more complicated, but we needed to essentially get the minimally viable set in - and many on the committee consider mdspan the most complicated thing which made it into C++23.

I think we resolved this issue to the degree we can. Gonna close the issue.