`|= empty` does not work as expected in v1.6
kpym opened this issue · comments
Describe the bug
Following the documentation |= empty
is supposed to delete the element.
But when multiple elements are set to empty it does not work properly.
To Reproduce
query: (.[] | select(. >= 2)) |= empty
input: [1,5,3,0,7]
output: [1,3,7]
Expected behavior
output: [1,0]
Environment:
jqplay
Note
- If you replace
>=2
with>=7
then7
is deleted. - If you replace
>=2
with>=5
then only5
is deleted.
$ jq-1.6 -cn '[1,5,3,0,7]|(.[]) |= empty'
[5,0]
$ jq-1.6 -cn '[1,5,3,0,7]|(.[range(length - 1; -1; -1)]) |= empty'
[]
$ jq-1.6 -cn '[1,5,3,0,7]|(.[range(length - 1; -1; -1)]|select(.>=2)) |= empty'
[1,0]
$
Wow, yeah, this is surprising. |= empty
works for objects, but for arrays... it's broken, and the reason its broken is that the paths being modified are computed on a copy of the array prior to the edits, while the edits (removals) remove items from a working copy, which changes the valid indices. Iterating backwards works around the problem.
I'm not sure that there's any way to fix this.
I have no idea of the code, but yes deleting elements from array inside a loop should always be done backwards !
There is another possibility : attribute the special value empty
and raise a flag, and then if the flag is raised apply del
to all empty
elements. By the way del
works witout problems on filtered arrays.
Ah, so here's the notional fix: when in the context of path(exp)
, EACH
and EACH_OPT
need to iterate arrays backwards. And they can know that they are being executed in the context of path/1
.
diff --git a/src/execute.c b/src/execute.c
index e30ee01..39d6e78 100644
--- a/src/execute.c
+++ b/src/execute.c
@@ -865,8 +865,8 @@ jv jq_next(jq_state *jq) {
keep_going = idx < len;
is_last = idx == len - 1;
if (keep_going) {
- key = jv_number(idx);
- value = jv_array_get(jv_copy(container), idx);
+ key = jv_number(jq->subexp_nest ? idx : len - (idx + 1));
+ value = jv_array_get(jv_copy(container), jq->subexp_nest ? idx : len - (idx + 1));
}
} else if (jv_get_kind(container) == JV_KIND_OBJECT) {
if (opcode == EACH || opcode == EACH_OPT) idx = jv_object_iter(container);
fixes it:
$ ./jq -cn '[[1,5,3,0,7]|path(.[])]'
[[4],[3],[2],[1],[0]]
$ ./jq -cn '[1,5,3,0,7]|(.[] | select(. >= 2)) |= empty'
[1,0]
$
ah, but it's not right:
$ ./jq -cn '[1,5,3,0,7]|.[]'
7
0
3
5
1
The problem is that jq->subexp_nest == 0
if we're either not at all in path/1
context, or we're in path/1
context but not in a "subexp". We need a bit more state.
How about delaying delpaths
?
def _modify(ps; f):
reduce path(ps) as $p
([., []]; label $out | (setpath([0] + $p; getpath([0] + $p) | f) | ., break $out), .[1] += [$p])
| . as $x | $x[0] | delpaths($x[1]);
I have some unexpected behaviour on using select
to the right hand side of a |=
which might be another example of the same issue but I'm not quite sure. Adapted from the example in the OP I have:
Input: [1,5,3,0,7]
Filter: .[] |= select(. >= 2)
Output: [5,3,7]
(as expected)
but If I change the select to the following: (https://jqplay.org/s/_6xd9PGoXN)
Input: [1,5,3,0,7]
Filter: .[] |= select(. == 2)
Output: [5,0]
Is this an instance of the same issue or something different?
Additionally, [range(10)] | .[] |= select(. <= 3)
will skip checking the next value after one evaluates to empty and will add a null
to the end of the array for every "emptied" element.
$ jq -cn '[range(10)] | .[] |= select(. <= 3)'
[0,1,2,3,5,7,9,null,null,null]
$ jq -cn '[range(10)] | .[] |= if . <= 3 then . else empty end'
[0,1,2,3,5,7,9,null,null,null]
So it does not resize the array and it shifts the elements by one after emptying an element resulting in the element after an emptied element being skipped.
[range(10)] | .[] |= empty
has a similar shifting behaviour, but the "emptied" elements are actually removed from the array instead of turning into nulls at the end of the array:
$ jq -cn '[range(10)] | .[] |= empty'
[1,3,5,7,9]
Just to mention that gojq do not have this issue.
-
> jq -cn '[1,5,3,0,7]|(.[]) |= empty' [5,0] > gojq -cn '[1,5,3,0,7]|(.[]) |= empty' []
-
> jq -cn '[range(10)] |.[] |= select(. == 2)' [1,2,4,6,8] > gojq -cn '[range(10)] |.[] |= select(. == 2)' [2]
-
>jq -cn '[range(10)] | .[] |= select(. <= 3)' [0,1,2,3,5,7,9,null,null,null] > gojq -cn '[range(10)] | .[] |= select(. <= 3)' [0,1,2,3]