data() on an empty vector

Quuxplusone opened this issue 7 years ago

Arthur O'Dwyer commented 7 years ago

data() on a vector of zero size but non-zero capacity:

- implementation returns the correct buffer pointer
- proposal does not impose any requirements
This matches the precedent set by std::vector: the wording leaves it unspecified, and vendors in practice return the buffer pointer.
I'd like to see the proposal require that the buffer pointer be returned (i.e. bring the wording into line with the implementation rather than leaving it underspecified).

data() on a vector of zero capacity:

- implementation returns nullptr
- proposal does not impose any requirements
This matches the precedent set by std::vector: the wording leaves it unspecified, and vendors in practice return nullptr.
Therefore this seems fine to me. (However, it would not be bad to require that nullptr be returned, rather than leaving it underspecified.)
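
The two cases can be probed with a short test. This is only a sketch: `EmptyDataReport` and `probe_empty_data` are names made up for illustration, and the results describe what a given standard library happens to do, not anything the wording promises.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type: what data() reports for a vector with no elements.
struct EmptyDataReport {
    bool zero_capacity_is_null;    // data() == nullptr when nothing was ever allocated
    bool reserved_is_non_null;     // data() != nullptr after reserve(), size() still 0
    bool reserved_matches_buffer;  // that pointer equals data() after the first push_back
};

inline EmptyDataReport probe_empty_data() {
    EmptyDataReport r{};

    std::vector<int> none;                // size 0; capacity is 0 on common implementations
    r.zero_capacity_is_null = (none.data() == nullptr);

    std::vector<int> reserved;
    reserved.reserve(256);                // size 0, capacity >= 256
    const int* before = reserved.data();  // the value this issue is about
    r.reserved_is_non_null = (before != nullptr);

    reserved.push_back(42);               // capacity suffices, so no reallocation
    r.reserved_matches_buffer = (before == reserved.data());
    return r;
}
```

On the mainstream implementations all three fields are expected to come back true, which is the "vendors in practice" behavior described above. A library that returned something else would still conform to the current wording.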

Vicente J. Botet Escriba · Answer 1 · Sun Oct 29 2017 14:42:59 GMT+0800 (China Standard Time)

What are the use cases for a vector of capacity 0?

Why do you want to get the pointer of a vector of size zero? What do you want to do with it? A use case would help to understand the need.

Arthur O'Dwyer · Answer 2 · Sun Oct 29 2017 15:09:44 GMT+0800 (China Standard Time)

This has come up on std-proposals many times in the past.
https://groups.google.com/a/isocpp.org/d/msg/std-proposals/sKy7fz_0ERQ/zyw_nQNzBgAJ
https://groups.google.com/a/isocpp.org/d/msg/std-proposals/M3MfGtYMGyY/ZbZ9K2FK9PUJ
https://groups.google.com/a/isocpp.org/d/msg/std-discussion/aR5mlaBtDj4/zz0Wfypk-WgJ

The main argument in favor of a well-defined data method is definitely just that undefined behavior sucks and well-defined behavior is awesome. Some ways that undefined behavior sucks include...

- relying on it renders your code non-portable
- it's harder to teach a special case ("data always returns the buffer unless the vector has size zero, in which case it's undefined") than to teach a general rule ("data always returns the buffer")
- someone will inevitably ask, "Well, if doing it that way is undefined behavior, then what should I do instead?" and we don't have a good answer for that right now

However, the usage scenario for vector that always comes up in these threads is something like this:

std::vector<T> vec;
vec.reserve(256);
c_api_register_base_address(vec.data());  // UNDEFINED BEHAVIOR!
for (int i=0; i < n; ++i) {
    vec.resize(0);
    while (vec.size() < vec.capacity()) {
        vec.push_back(next_element());
    }
    c_api_process_256_elements();
}

The above code already does the right thing on all vendors' implementations, but the marked line technically has undefined behavior according to the Standard wording for vector. It would be nice if the Standard wording were brought into line with reality — and, relevant to this proposal, it would be nice if we did not add new wording that repeats the mistakes of the past.
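
For anyone who wants to try the scenario, here is a self-contained sketch of the snippet above. The C API bodies, the globals `g_base` and `g_sum`, the `int` element type, and the `run_batches` wrapper are all stand-ins invented for illustration. The inner loop fills to a fixed 256 rather than to capacity(), since reserve(256) is allowed to give more than 256.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the C API used in the snippet above.
static const int* g_base = nullptr;  // base address the "C API" remembers
static long g_sum = 0;               // running total, so the result can be checked

void c_api_register_base_address(const int* p) { g_base = p; }

void c_api_process_256_elements() {
    for (std::size_t i = 0; i < 256; ++i) g_sum += g_base[i];
}

long run_batches(int n) {
    g_sum = 0;
    std::vector<int> vec;
    vec.reserve(256);
    c_api_register_base_address(vec.data());  // the disputed call: size() == 0 here

    int next = 0;  // plays the role of next_element()
    for (int i = 0; i < n; ++i) {
        vec.resize(0);                        // drops the elements, keeps the buffer
        while (vec.size() < 256) {
            vec.push_back(++next);            // never reallocates: capacity >= 256
        }
        // Guard for the sketch: stop if the registered address is not the
        // real buffer, instead of reading through a bad pointer.
        assert(g_base == vec.data());
        c_api_process_256_elements();
    }
    return g_sum;
}
```

With the behavior vendors ship today, the guard never fires and `run_batches(2)` sums the values 1 through 512.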

gnzlbg · Answer 3 · Sun Oct 29 2017 17:25:17 GMT+0800 (China Standard Time)

A previous version of the proposal required this, but array and vector do not.

Right now you can teach the same thing (with the special case) for all ContiguousContainers. If we make this change, you will need to teach another special case (that fixed_capacity_vector is different).

IMO this should be changed for all ContiguousContainers. If you care about this, write a small paper. I'll be happy to give you feedback on it.

So TL;DR: I agree that data for vector, array, and fixed_capacity_vector should return nullptr when the container has no buffer: begin() == end() == data() == nullptr. However, the objective of this proposal is to add a container to the standard, not to improve this behavior for all affected containers. So I can't really do that here :/

Arthur O'Dwyer · Answer 4 · Mon Oct 30 2017 02:29:13 GMT+0800 (China Standard Time)

> However, the objective of this proposal is to add a container to the standard, not to improve this behavior for all affected containers.

That's fair. I think there's precedent for both attitudes: either "We're adding a new feature; let's imitate the old stuff as much as possible" (yours in this case) or "We're adding a new feature; this is our chance to 'fix' things even at the cost of inconsistency" (mine in this case).

Examples of the latter include how C++11 uniform initialization syntax took the opportunity to nail down the order of evaluation for braced-initializers, or how C++17 optional and variant took the opportunity to provide rvalue-ref-qualified accessors even though C++03 vector and C++11 unique_ptr didn't. Examples of the former are too numerous to list. 🙂
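
The accessor difference can be checked at compile time. A small sketch, assuming a C++17 compiler; `take_payload` and `payload_roundtrip_ok` are hypothetical helpers added only to show a use:

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>
#include <vector>

// Accessors on an rvalue optional or variant yield an rvalue reference...
static_assert(std::is_same_v<decltype(*std::declval<std::optional<int>>()), int&&>);
static_assert(std::is_same_v<decltype(std::declval<std::optional<int>>().value()), int&&>);
static_assert(std::is_same_v<decltype(std::get<0>(std::declval<std::variant<int>>())), int&&>);

// ...while the older types return an lvalue reference even from an rvalue.
static_assert(std::is_same_v<decltype(std::declval<std::vector<int>>()[0]), int&>);
static_assert(std::is_same_v<decltype(*std::declval<std::unique_ptr<int>>()), int&>);

// Hypothetical helper: move the payload out of an optional that is about to die.
inline std::string take_payload(std::optional<std::string> opt) {
    return *std::move(opt);  // picks operator*() &&, so the string is moved out
}

// Hypothetical check used to exercise the helper.
inline bool payload_roundtrip_ok() {
    return take_payload(std::string("hello")) == "hello";
}
```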

Vicente J. Botet Escriba · Answer 5 · Mon Oct 30 2017 05:08:30 GMT+0800 (China Standard Time)

@gnzlbg +1 for an independent paper that solves the whole problem, if there is one.

@Quuxplusone If we have a real problem we need a paper. Having a fix for fixed_capacity_vector will not solve your problem, as you wouldn't have it fixed for vector.

If you think that other containers should benefit from rvalue-ref-qualified accessors, you will need a paper for that as well.

gnzlbg · Answer 6 · Mon Oct 30 2017 18:42:48 GMT+0800 (China Standard Time)

I thought a bit more about this, and one thing you need to consider is allocators with a custom pointer type. std::vector currently supports those, and I don't know to what extent these custom pointer types can be nullptr. That might be one of the reasons why the current specification is worded the way it is, even though for std::allocator all implementations use nullptr in this case. Obviously, for fixed_capacity_vector this argument is moot because it doesn't have an Allocator. So if that is the only reason why std::vector does it the way it does, maybe we can fix it here.

gnzlbg · Answer 7 · Mon Jan 08 2018 17:29:02 GMT+0800 (China Standard Time)

So I'm closing this, since there is nothing I can do about it in this proposal. I think many would agree with a proposal to specify this behavior, so if you are interested in writing one, please do (I can give it a read).