[BUG] Initialization vs assignment in a loop
ntrel opened this issue · comments
To Reproduce
main: () =
{
i := 0;
p: std::unique_ptr<int>;
while i < 3 next i++ {
std::cout << i << "\n";
p = new<int>(i);
std::cout << p* << "\n";
}
}
The p = new<int>
line generates a call to p.construct
, which works for the first iteration to initialize p
. Then on the second iteration, an assignment was intended, but p.construct
is called again, which causes a contract violation.
0
0
1
Contract violation
terminate called without an active exception
Aborted (core dumped)
git cppfront, g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
This should be rejected, just like https://cpp2.godbolt.org/z/8ssxKs4YT is:
main: () = {
p: std::unique_ptr<int>;
if true {
p = new<int>(i);
}
}
main.cpp2...
main.cpp2(2,3): error: local variable p must be initialized on both branches or neither branch
main.cpp2(3,3): error: "if" initializes p on:
branch starting at line 3
but not on:
implicit else branch
==> program violates initialization safety guarantee - see previous errors
This should be rejected
For consistency, yes. But in both cases it would be nice to only error if the variable is actually used after the branch. If it is only used in the branch where it is initialized, that could be allowed & would be useful e.g. when a pointer (declared in the same scope as the variable) is set to point at the variable after the variable is initialized. BTW the main readme mentions P1179, perhaps that will allow this and #440 as well.
Yes, I agree this is a bug, lazy initialization shouldn't be allowed in a loop (loops can be entered zero times).
it would be nice to only error if the variable is actually used after the branch.
It would be possible to make this work, but wouldn't that be equivalent to suppressing an unused variable warning?
OK, fixed.
The following errors are now diagnosed:
error1: () = {
i: int;
while true {
i = 42; // ERROR: can't initialize i in a loop
}
i = 42;
}
error2: () = {
i: int;
if true {
while true {
i = 42; // ERROR: can't initialize i in a loop
}
i = 42;
}
else {
i = 42;
}
i = 42;
}
And the following is allowed:
ok: () = {
i: int;
if true {
i = 42;
while true { // OK: in-branch loop is after initialization
i = 42;
}
}
else {
i = 42;
}
i = 42;
}
Also, uses like this one already in reflect.h2
continue to work:
protected parse_statement: ( /*...*/ )
-> (ret: std::unique_ptr<statement_node>) // LAZILY INITIALIZED
= {
/*... code that doesn't mention ret ... */
if /*...*/ {
while /*... code that doesn't mention ret ... */ { // OK: this loop doesn't matter
/*... code that doesn't mention ret ... */
}
}
/*... code that doesn't mention ret ... */
ret = parser.parse_one_declaration( // OK: definite first use of ret is a construction
tokens*.get_map().begin()*.second,
generated_tokens*
);
/*...*/
}
So this is invalid C++2?
error1: () = {
i: int;
while true {
s = input_from_user(); // string
if s.is_int() {
i = s.to_int(); // ERROR: can't initialize i in a loop??
break;
}
// ask again
}
// Is this point flagged as an error too?
}
Oh, is the reasoning here that the user can initialize i
where it's declared to some default value, and that's easier than trying to distinguish between construction and assignment in a loop?
So this is invalid C++2?
[... example where line 6 is:]
i = s.to_int(); // ERROR: can't initialize i in a loop??
Correct, and is now diagnosed with the above recent commit:
test.cpp2(6,13): error: local variable i cannot be initialized inside a loop
==> program violates initialization safety guarantee - see previous errors
// Is this point flagged as an error too?
No because you get the earlier error. But if you had code that didn't have the error but then didn't use a the variable you would get an error from the Cpp1 compiler if "unused variable" warnings are on.
that's easier than trying to distinguish between construction and assignment in a loop?
The latter isn't just harder, I think it's not possible for for
or while
(but possibly could work for do
).
The primary reason is that a for
or while
loop could execute zero times, so it might never have an opportunity to initialize at all.
If somehow we guaranteed at-least-once loop semantics (such as do
does, or by inventing a for_at_least_once
and while_at_least_once
or similar), we could allow initialization inside a loop at the cost of generating an additional first_iteration
local variable and then generating { auto&& __rhs = s.to_it(); if (first_iteration) { i.construct(CPP2_FORWARD(__rhs)); } else { i = CPP2_FORWARD(__rhs); }
. That's possible but I haven't found a reason to implement that flexibility. Plus it costs 0-2 extra variables (modulo optimizations) and a branch, though that's less of a concern because you would pay for the overhead only if you use it.
It would be possible to make this work, but wouldn't that be equivalent to suppressing an unused variable warning?
Not if it is initialized and used in the branch. So why not declare the variable in the branch? The example shows why - though in real code the type of the variable might be more complex - some complicated template instantiation.
main: () = {
i: int;
if e1 {
// other code
if e2 {
// initialize and use i
}
// no use of i
}
else if e2 {
// initialize and use i
}
// no use of i
}
Not if it is initialized and used in the branch. So why not declare the variable in the branch?
I agree with both parts, including "why not declare the variable in the branch?" That is, in a case like this why wouldn't it be better to write it as follows?
main: () = {
if e1 {
// other code
if e2 {
// initialize and use i
i := 42; std::cout << i;
}
// no use of i
}
else if e2 {
// initialize and use i
i := 42; std::cout << i;
}
// no use of i
}
Now it's correct by construction; you can't make the mistake of using i
later because its name isn't even in scope.
The purpose of declaring a local in a larger scope is so that it can be used later in the scope (in both Cpp2 and Cpp1), and allowing it to have no initializer allows initializing it in a nested branch scope first including to use different constructors etc. (in Cpp2). If it's only going to be used in the branch scope, shouldn't it be declared there?
in real code the type of the variable might be more complex - some complicated template instantiation
Well you can alias that type and still declare the variables in the innermost scope.
in real code the type of the variable might be more complex - some complicated template instantiation
Well you can alias that type and still declare the variables in the innermost scope.
I agree with @jcanizales but I now see I had missed @ntrel 's point originally... thanks Jorge for fixing my reply with a better shorter one, and my apologies Nick that I missed your clearly stated point the first time!