aantron / better-enums

C++ compile-time enum to string, iteration, in a single header file

Home Page: http://aantron.github.io/better-enums

Are there guarantees on size with better enums?

jaskij opened this issue · comments

I'm in a weird situation where, because of std::vector<bool> bit packing and a need to interface with a C API, I need to make my own boolean type (the alternative being a non-standard container, like boost::container::vector).

Because of that C API I need a guarantee that sizeof(BETTER_ENUM(foo, char)) == sizeof(char), does Better Enums provide such a guarantee?

Using simple test code (below), this seems to be true, but the actual implementation is buried so deep in the macros that it's difficult to analyze.

Test code:

#include <iostream>

#include <better-enums/enum.h>

BETTER_ENUM(mbool, char, mfalse = 0, mtrue = 1)

int main() {
    std::cout << sizeof(mbool) << ' ' << sizeof(char) << std::endl;
    std::cout << std::boolalpha << (sizeof(mbool) == sizeof(char)) << std::endl;

    return 0;
}

Edit: I'm aware that enum class mbool : char would work for this use case, but Better Enums are, well, better.
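For readers who haven't hit the std::vector<bool> problem: that specialization packs its elements into bits, so it has no contiguous array of one-byte values to hand to a C function. A minimal sketch of the workaround follows. The names c_api_count_set and byte_bool are made up for illustration (they come from neither Better Enums nor the real C API), and byte_bool just stands in for a one-byte enum type such as mbool.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the C API: counts non-zero one-byte flags.
std::size_t c_api_count_set(const char* flags, std::size_t n) {
    assert(flags != nullptr || n == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (flags[i] != 0) ++count;
    }
    return count;
}

// Hypothetical one-byte boolean; BETTER_ENUM(mbool, char, ...) would play this role.
struct byte_bool {
    char value;
};
static_assert(sizeof(byte_bool) == sizeof(char), "byte_bool must be one byte");

std::size_t count_true(const std::vector<byte_bool>& v) {
    // A vector of a one-byte class type is a contiguous array of bytes,
    // so its storage can be viewed as const char* and passed to the C side.
    return c_api_count_set(reinterpret_cast<const char*>(v.data()), v.size());
}
```

The static_assert is what makes the reinterpret_cast reasonable: if the element type were ever larger than one byte, the build would stop instead of passing a mis-sized array.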

Practically, yes. In BETTER_ENUM(Foo, T), the generated class will have only one data member, of type T. So unless some exotic (and, I think, non-standard) packing rules come into play on some compiler or other, the enum should have the same size as T.

You can inspect it without macros by dumping the output of the preprocessor (-E on compilers with gcc-like interfaces).

If curious, the member is defined here:

_integral _value; \

its type here:

typedef Underlying _integral; \

with the token Underlying in the expansion of

better-enums/enum.h

Lines 634 to 637 in f3ff0a6

#define BETTER_ENUMS_TYPE(SetUnderlyingType, SwitchType, GenerateSwitchType, \
GenerateStrings, ToStringConstexpr, \
DeclareInitialize, DefineInitialize, CallInitialize, \
Enum, Underlying, ...) \

and Underlying always gets passed through the BETTER_ENUM macro unchanged, as one of the last parameters in each expansion (which otherwise are selecting other details based on compiler version and other settings):

better-enums/enum.h

Lines 1190 to 1230 in f3ff0a6

// Top-level macros.
#define BETTER_ENUM(Enum, Underlying, ...) \
BETTER_ENUMS_ID(BETTER_ENUMS_TYPE( \
BETTER_ENUMS_CXX11_UNDERLYING_TYPE, \
BETTER_ENUMS_DEFAULT_SWITCH_TYPE, \
BETTER_ENUMS_DEFAULT_SWITCH_TYPE_GENERATE, \
BETTER_ENUMS_DEFAULT_TRIM_STRINGS_ARRAYS, \
BETTER_ENUMS_DEFAULT_TO_STRING_KEYWORD, \
BETTER_ENUMS_DEFAULT_DECLARE_INITIALIZE, \
BETTER_ENUMS_DEFAULT_DEFINE_INITIALIZE, \
BETTER_ENUMS_DEFAULT_CALL_INITIALIZE, \
Enum, Underlying, __VA_ARGS__))
#define SLOW_ENUM(Enum, Underlying, ...) \
BETTER_ENUMS_ID(BETTER_ENUMS_TYPE( \
BETTER_ENUMS_CXX11_UNDERLYING_TYPE, \
BETTER_ENUMS_DEFAULT_SWITCH_TYPE, \
BETTER_ENUMS_DEFAULT_SWITCH_TYPE_GENERATE, \
BETTER_ENUMS_CXX11_FULL_CONSTEXPR_TRIM_STRINGS_ARRAYS, \
BETTER_ENUMS_CONSTEXPR_TO_STRING_KEYWORD, \
BETTER_ENUMS_DECLARE_EMPTY_INITIALIZE, \
BETTER_ENUMS_DO_NOT_DEFINE_INITIALIZE, \
BETTER_ENUMS_DO_NOT_CALL_INITIALIZE, \
Enum, Underlying, __VA_ARGS__))
#else
#define BETTER_ENUM(Enum, Underlying, ...) \
BETTER_ENUMS_ID(BETTER_ENUMS_TYPE( \
BETTER_ENUMS_LEGACY_UNDERLYING_TYPE, \
BETTER_ENUMS_DEFAULT_SWITCH_TYPE, \
BETTER_ENUMS_DEFAULT_SWITCH_TYPE_GENERATE, \
BETTER_ENUMS_CXX98_TRIM_STRINGS_ARRAYS, \
BETTER_ENUMS_NO_CONSTEXPR_TO_STRING_KEYWORD, \
BETTER_ENUMS_DO_DECLARE_INITIALIZE, \
BETTER_ENUMS_DO_DEFINE_INITIALIZE, \
BETTER_ENUMS_DO_CALL_INITIALIZE, \
Enum, Underlying, __VA_ARGS__))
#endif
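Stripped of the macros, the shape being described is roughly the following. This is a simplified, hypothetical sketch and not the real expansion: FooSketch and its two constants are invented, and only the two lines marked "corresponds to" mirror the excerpts quoted above.

```cpp
#include <type_traits>

// Simplified stand-in for what BETTER_ENUM(Foo, char, A, B) might generate.
class FooSketch {
  public:
    typedef char _integral;  // corresponds to `typedef Underlying _integral;`
    enum _enumerated : _integral { A = 0, B = 1 };

    constexpr FooSketch(_enumerated value) : _value(value) {}
    constexpr _integral _to_integral() const { return _value; }

  private:
    _integral _value;        // corresponds to `_integral _value;` (the only data member)
};

// A standard-layout class has no padding before its first member.
static_assert(std::is_standard_layout<FooSketch>::value,
              "FooSketch should be standard-layout");
// Trailing padding is what the size check actually rules out.
static_assert(sizeof(FooSketch) == sizeof(char),
              "FooSketch should be exactly as large as its underlying type");
```

Everything else in the class (constructors, member functions, typedefs, nested enum) adds no storage, which is why the size comes down to that single member.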

So it's implementation dependent, but most implementations will give the desired result. Considering that right now I'm only targeting gcc, and a specific version at that, I can live with this. Thanks.

A note in case anyone else ever hits this issue: if you're using C++11, it's best to leave a static_assert beside the enum to get a compilation error if the size changes.

So my code above would change to:

#include <cstdint>
#include <iostream>

#include <better-enums/enum.h>

BETTER_ENUM(mbool, char, mfalse = 0, mtrue = 1)
static_assert(sizeof(mbool) == sizeof(std::uint8_t), "BETTER_ENUM is the wrong size!");

int main() {
    std::cout << sizeof(mbool) << ' ' << sizeof(char) << std::endl;
    std::cout << std::boolalpha << (sizeof(mbool) == sizeof(char)) << std::endl;

    return 0;
}

@jaskij Thanks.

About the note, I don't know about best: it's extremely unlikely that the size would be different from the underlying type. However, it can still be good, if the size of something is particularly important and you want to be extra sure. I do, of course, use plenty of static_asserts for paranoia and documentation reasons, like (I think) most people :)
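If the size check is mostly there as documentation, it can be bundled with a few related properties. A possible sketch, where layout_matches, mbool_sketch and padded_sketch are all hypothetical names used only for illustration:

```cpp
#include <type_traits>

// Hypothetical helper: true when Wrapper has the same size and alignment as
// Underlying and is simple enough to be copied around as raw bytes.
template <typename Wrapper, typename Underlying>
struct layout_matches
    : std::integral_constant<bool,
          sizeof(Wrapper) == sizeof(Underlying) &&
          alignof(Wrapper) == alignof(Underlying) &&
          std::is_standard_layout<Wrapper>::value &&
          std::is_trivially_copyable<Wrapper>::value> {};

// Stand-in for a one-member enum class such as BETTER_ENUM(mbool, char, ...).
struct mbool_sketch {
    char value;
};

// Counter-example: a second member changes the size, so the check fails.
struct padded_sketch {
    char value;
    int extra;
};

static_assert(layout_matches<mbool_sketch, char>::value,
              "mbool_sketch must be layout-compatible with char");
static_assert(!layout_matches<padded_sketch, char>::value,
              "padded_sketch is not layout-compatible with char");
```

With the real library, the first static_assert would presumably be written against the enum type and its underlying type, e.g. layout_matches<mbool, char>.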

So it's implementation dependent, but most implementations will give the desired result. Considering that right now I'm only targeting gcc, and a specific version at that, I can live with this. Thanks.

I'm not aware of any implementations where this property doesn't hold, and, as I recall (I'm rusty), such an implementation would be violating standards. However, over the years, there have been implementations that violated standards, had bugs, etc., and Better Enums doesn't take any special measures to work around anything like that — that's the only sense in which, and reason why, I did not outright say that Better Enums guarantees that the size will be equal. In practice, it is equal. Saying it is "implementation dependent" gives the wrong impression, since the phrase is usually used when something is actually known to vary.

To add more, I'm not aware of any implementation ever where the size would have been different.

@aantron

such an implementation would be violating standards

Nope, padding is strictly left to the compiler. The uint8_t (or char) member must be 1-byte, but the class object itself doesn't have to be.

There are some architectures (iirc older ARM) which don't support unaligned access, meaning all memory accesses must be aligned to a word boundary. In that case it's conceivable that specific optimization flags could enable padding the class to a multiple of the word size. Unlikely, but possible.

Edit:

Microcontrollers, ARM and otherwise (such as those used in some Arduinos), also don't support unaligned access, so the same applies there. I'd expect the typical case to be a small size plus extra instructions to extract unaligned objects, but we can't be sure.

About the note, I don't know about best
I meant it's best given the context - caring about the size of my BETTER_ENUM.

@aantron I took a closer look at your API, and BETTER_ENUMS_CLASS_ATTRIBUTE with the packed attribute will do nicely for me, to make gcc behave exactly the way I want.
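For anyone wondering what the packed attribute does, here is a small sketch that does not use Better Enums at all. It assumes GCC or Clang, since __attribute__((packed)) is a compiler extension; SKETCH_CLASS_ATTRIBUTE, packed_flag and packed_pair are made-up names, with the macro only mimicking the role that BETTER_ENUMS_CLASS_ATTRIBUTE is described as playing above.

```cpp
// Assumption: GCC or Clang. Other compilers spell this differently or not at all.
#define SKETCH_CLASS_ATTRIBUTE __attribute__((packed))

// One-member class, similar in shape to a char-based enum class.
class SKETCH_CLASS_ATTRIBUTE packed_flag {
  public:
    char value;
};

// Two-member class, to show the attribute removing padding that would
// normally be inserted between `tag` and `payload`.
class SKETCH_CLASS_ATTRIBUTE packed_pair {
  public:
    char tag;
    int payload;
};

static_assert(sizeof(packed_flag) == sizeof(char),
              "packed_flag should occupy exactly one byte");
static_assert(alignof(packed_flag) == 1,
              "packed_flag should have byte alignment");
static_assert(sizeof(packed_pair) == sizeof(char) + sizeof(int),
              "packed_pair should have no padding");
```

For a one-member char class the attribute normally changes nothing, so it mainly serves as an explicit statement of intent to the compiler.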

Here's a Stack Overflow question explaining the pitfalls

Nope, padding is strictly left to the compiler. The uint8_t (or char) member must be 1-byte, but the class object itself doesn't have to be.

I don't think it's strictly up to the compiler — there are some constraints, the compiler can make decisions within those constraints, but Better Enums should be outside all those constraints anyway.

The compiler, AFAIK, cannot (if it is standards-compliant) just insert completely arbitrary padding.

As I have always understood it (could be wrong), this padding only can optionally be inserted between fields and after fields, if there are other fields in the struct/class that would force access to one of the subsequent members or a next struct to inherently be misaligned if the previous field/struct access is aligned. So, at least in cases I know of, this wouldn't apply to a one-field class like a Better Enum.

For trailing padding for architectures that require aligned access, I indeed don't know what will happen with a 1-field class. It doesn't inherently have any issue that the basic underlying type wouldn't have, e.g. a 4-byte alignment requirement at a basic level should affect an array of char enums and an array of chars in the same way. However, would the compiler insert 3 trailing bytes, rather than pack the enums in the array and emit extra instructions? I don't have a ready way of checking, so a static_assert indeed might be a good idea, if your code might be built for this kind of system.

However, I also vaguely remember hearing something about systems on which sizeof(uint8_t) was either reported as more than 1, or reported as 1 but actually packed into much larger elements in arrays (my memory is hazy on this). So if that's the case, depending on the situation, even the assert may or may not be able to save the user, in case the compiler ends up packing these types differently into arrays even while reporting the size as the same, because it considers one to be a basic type, and the other a composite.
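For what it's worth, a few of these properties can be checked at compile time. sizeof(char) is 1 by definition, std::uint8_t can only exist where bytes are exactly 8 bits wide, and an array of N elements always occupies N * sizeof(element) bytes, so on a conforming implementation array packing cannot differ from what sizeof reports. A sketch, with flag_sketch and array_bytes being made-up names:

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

static_assert(sizeof(char) == 1, "true by definition on every implementation");
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(sizeof(std::uint8_t) == 1, "uint8_t, where it exists, is one byte");

// Stand-in for a one-member enum class with a char underlying type.
struct flag_sketch {
    char value;
};

// Size in bytes of an array of N objects of type T.
template <typename T, std::size_t N>
constexpr std::size_t array_bytes() {
    return sizeof(T[N]);
}

// Arrays have no padding between elements, for basic and class types alike.
static_assert(array_bytes<char, 16>() == 16, "16 chars take 16 bytes");
static_assert(array_bytes<flag_sketch, 16>() == 16 * sizeof(flag_sketch),
              "16 flags take 16 * sizeof(flag_sketch) bytes");
```

Platforms where a byte is wider than 8 bits do exist, but there the CHAR_BIT assertion fails and std::uint8_t is typically not provided, rather than sizeof silently misreporting.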

As I have always understood it (could be wrong), this padding only can optionally be inserted between fields and after fields, if there are other fields in the struct/class that would force access to one of the subsequent members or a next struct to inherently be misaligned if the previous field/struct access is aligned. So, at least in cases I know of, this wouldn't apply to a one-field class like a Better Enum.

You're right, my bad.

And I went overboard with that alignment requirement description: the address must be aligned to a multiple of the access size, which is still 1 here. You can't access a uint16_t at an odd address, but you can access a uint8_t (assuming regular sizes). This bit me once, when I passed an odd address to a function that silently cast it to uint16_t*.
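As an aside on that uint16_t* mishap, the usual way to avoid it is to copy the bytes instead of casting the pointer. A minimal sketch, where load_u16 is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// One-byte types can live at any address, so they never need padding for alignment.
static_assert(alignof(std::uint8_t) == 1, "uint8_t has byte alignment");

// Reads a 16-bit value in native byte order from any address, aligned or not.
// memcpy avoids the misaligned access that dereferencing a casted
// uint16_t* at an odd address could cause.
std::uint16_t load_u16(const unsigned char* p) {
    assert(p != nullptr);
    std::uint16_t out;
    std::memcpy(&out, p, sizeof out);
    return out;
}
```

Calling load_u16(buffer + 1) is fine even when buffer + 1 is an odd address, whereas *reinterpret_cast<const std::uint16_t*>(buffer + 1) is undefined behavior and can fault on targets without unaligned access.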

So my static_assert is most likely completely meaningless here.

However, I also vaguely remember hearing something about systems on which sizeof(uint8_t) was either reported as more than 1, or reported as 1 but actually packed into much larger elements in arrays (my memory is hazy on this). So if that's the case, depending on the situation, even the assert may or may not be able to save the user, in case the compiler ends up packing these types differently into arrays even while reporting the size as the same, because it considers one to be a basic type, and the other a composite.

This sounds like some retro stuff and nothing I ever came in touch with.

Thanks for your patience, this is completely cleared up now ;) and I'll still leave that static_assert cause I'm paranoid and don't want to get bitten.