jqlang / jq

Command-line JSON processor

Home Page: https://jqlang.github.io/jq/

`|= empty` does not work as expected in v1.6

kpym opened this issue

Describe the bug
According to the documentation, |= empty is supposed to delete the element.
But when multiple elements are set to empty, it does not work properly.

To Reproduce
query: (.[] | select(. >= 2)) |= empty
input: [1,5,3,0,7]
output: [1,3,7]

Expected behavior
output: [1,0]

Environment:
jqplay

Note

  • If you replace >=2 with >=7 then 7 is deleted.
  • If you replace >=2 with >=5 then only 5 is deleted.
$ jq-1.6 -cn '[1,5,3,0,7]|(.[]) |= empty'
[5,0]
$ jq-1.6 -cn '[1,5,3,0,7]|(.[range(length - 1; -1; -1)]) |= empty'
[]
$ jq-1.6 -cn '[1,5,3,0,7]|(.[range(length - 1; -1; -1)]|select(.>=2)) |= empty'
[1,0]
$ 

Wow, yeah, this is surprising. |= empty works for objects, but for arrays it's broken. The reason is that the paths being modified are computed on a copy of the array prior to the edits, while the edits (removals) remove items from a working copy, which changes the valid indices. Iterating backwards works around the problem.
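To make the index shifting concrete, here is a minimal Python model of the explanation above. It is only a sketch, not jq's actual implementation: the helper name `delete_at_precomputed` is hypothetical, and it assumes that deleting an index that no longer exists is a silent no-op. Under those assumptions it reproduces the four outputs reported earlier in this thread.

```python
def delete_at_precomputed(items, indices):
    """Delete positions from a working copy, using indices that were
    computed on the original list before any deletion happened."""
    work = list(items)
    for i in indices:
        if 0 <= i < len(work):  # assumption: out-of-range deletion is a no-op
            del work[i]         # every later element shifts left by one
    return work


data = [1, 5, 3, 0, 7]
all_idx = list(range(len(data)))                    # models path(.[])
sel_idx = [i for i, v in enumerate(data) if v >= 2]  # models path(.[] | select(. >= 2))

forward_all = delete_at_precomputed(data, all_idx)
backward_all = delete_at_precomputed(data, reversed(all_idx))
forward_sel = delete_at_precomputed(data, sel_idx)
backward_sel = delete_at_precomputed(data, sorted(sel_idx, reverse=True))

print(forward_all)   # [5, 0]
print(backward_all)  # []
print(forward_sel)   # [1, 3, 7]
print(backward_sel)  # [1, 0]
```

Going forwards, each deletion invalidates every index after it, so alternate elements survive. Going backwards, a deletion only moves elements whose indices have already been handled.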

I'm not sure that there's any way to fix this.

I don't know the code, but yes, deleting elements from an array inside a loop should always be done backwards!

There is another possibility: assign the special value empty and raise a flag, and then, if the flag is raised, apply del to all empty elements. By the way, del works without problems on filtered arrays.

Ah, so here's the notional fix: when in the context of path(exp), EACH and EACH_OPT need to iterate arrays backwards. And they can know that they are being executed in the context of path/1.

diff --git a/src/execute.c b/src/execute.c
index e30ee01..39d6e78 100644
--- a/src/execute.c
+++ b/src/execute.c
@@ -865,8 +865,8 @@ jv jq_next(jq_state *jq) {
         keep_going = idx < len;
         is_last = idx == len - 1;
         if (keep_going) {
-          key = jv_number(idx);
-          value = jv_array_get(jv_copy(container), idx);
+          key = jv_number(jq->subexp_nest ? idx : len - (idx + 1));
+          value = jv_array_get(jv_copy(container), jq->subexp_nest ? idx : len - (idx + 1));
         }
       } else if (jv_get_kind(container) == JV_KIND_OBJECT) {
         if (opcode == EACH || opcode == EACH_OPT) idx = jv_object_iter(container);

fixes it:

$ ./jq -cn '[[1,5,3,0,7]|path(.[])]'
[[4],[3],[2],[1],[0]]
$ ./jq -cn '[1,5,3,0,7]|(.[] | select(. >= 2)) |= empty'
[1,0]
$ 

ah, but it's not right:

$ ./jq -cn '[1,5,3,0,7]|.[]'
7
0
3
5
1

The problem is that jq->subexp_nest == 0 if we're either not at all in path/1 context, or we're in path/1 context but not in a "subexp". We need a bit more state.

How about delaying delpaths?

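def _modify(ps; f):
  # State is [value, paths_to_delete]: a path whose update yields empty is only recorded here.
  reduce path(ps) as $p
    ([., []]; label $out | (setpath([0] + $p; getpath([0] + $p) | f) | ., break $out), .[1] += [$p])
      # All recorded paths are deleted in one delpaths call after the loop.
      | . as $x | $x[0] | delpaths($x[1]);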
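The idea behind the proposed definition can be sketched in Python for the flat-array case. This is an illustrative model only: `EMPTY` and `update_each_deferred` are hypothetical names, and deleting the recorded indices from highest to lowest stands in for a single `delpaths` call.

```python
EMPTY = object()  # hypothetical stand-in for jq's `empty`


def update_each_deferred(items, update):
    """Apply `update` to every element, but defer all removals
    until every index has been visited."""
    work = list(items)
    to_delete = []
    for idx in range(len(work)):
        result = update(work[idx])
        if result is EMPTY:
            to_delete.append(idx)  # record the index, do not delete yet
        else:
            work[idx] = result
    # Remove from the highest index down so earlier indices stay valid.
    for idx in sorted(to_delete, reverse=True):
        del work[idx]
    return work


print(update_each_deferred([1, 5, 3, 0, 7], lambda v: EMPTY))
# [] 
print(update_each_deferred([1, 5, 3, 0, 7], lambda v: v if v >= 2 else EMPTY))
# [5, 3, 7]
print(update_each_deferred(list(range(10)), lambda v: v if v <= 3 else EMPTY))
# [0, 1, 2, 3]
```

Because no element moves while the loop is running, every index computed up front still points at the element it was computed for.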

I'm seeing some unexpected behaviour when using select on the right-hand side of |=, which might be another example of the same issue, but I'm not quite sure. Adapted from the example in the OP, I have:

Input: [1,5,3,0,7]
Filter: .[] |= select(. >= 2)
Output: [5,3,7] (as expected)

But if I change the select to the following (https://jqplay.org/s/_6xd9PGoXN):

Input: [1,5,3,0,7]
Filter: .[] |= select(. == 2)
Output: [5,0]

Is this an instance of the same issue or something different?

@amagee That's surely an instance of the same issue.

Additionally, [range(10)] | .[] |= select(. <= 3) skips checking the value that follows one that evaluates to empty, and adds a null to the end of the array for every "emptied" element.

$ jq -cn '[range(10)] | .[] |= select(. <= 3)'
[0,1,2,3,5,7,9,null,null,null]
$ jq -cn '[range(10)] | .[] |= if . <= 3 then . else empty end'
[0,1,2,3,5,7,9,null,null,null]

So it does not resize the array, and it shifts the elements by one after emptying an element, which results in the element after an emptied element being skipped.

[range(10)] | .[] |= empty has a similar shifting behaviour, but the "emptied" elements are actually removed from the array instead of turning into nulls at the end of the array:

$ jq -cn '[range(10)] | .[] |= empty'
[1,3,5,7,9]
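One way to account for both the skipped elements and the trailing nulls is the following Python model. It is a sketch under stated assumptions, not jq's code: indices are fixed from the original length, reading past the end of the working copy yields null (`None`), writing past the end pads with null, and null compares below every number as in jq's sort order. `EMPTY`, `update_each` and the `select_*` helpers are hypothetical names.

```python
EMPTY = object()  # hypothetical stand-in for jq's `empty`


def update_each(items, update):
    """Visit indices 0..len(items)-1 of the ORIGINAL list while
    editing a working copy in place."""
    work = list(items)
    for idx in range(len(items)):
        # Assumption: reading past the end yields null.
        current = work[idx] if idx < len(work) else None
        result = update(current)
        if result is EMPTY:
            if idx < len(work):
                del work[idx]  # later elements shift left, so the next one is skipped
        else:
            # Assumption: writing past the end pads the list with null.
            work.extend([None] * (idx + 1 - len(work)))
            work[idx] = result
    return work


def select_le_3(v):
    # null sorts below numbers in jq, so `null <= 3` is treated as true here
    return v if (v is None or v <= 3) else EMPTY


def select_eq_2(v):
    return v if v == 2 else EMPTY


print(update_each(list(range(10)), select_le_3))
# [0, 1, 2, 3, 5, 7, 9, None, None, None]
print(update_each(list(range(10)), lambda v: EMPTY))
# [1, 3, 5, 7, 9]
print(update_each([1, 5, 3, 0, 7], select_eq_2))
# [5, 0]
```

In this model the nulls come from the last three indices: by then the working copy is shorter than the original, the read returns null, null passes the `<= 3` test, and writing it back extends the array.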

Just to mention that gojq does not have this issue.

  •   > jq -cn '[1,5,3,0,7]|(.[]) |= empty'
      [5,0]
      > gojq -cn '[1,5,3,0,7]|(.[]) |= empty'
      []
  •   > jq -cn '[range(10)] |.[] |= select(. == 2)'
      [1,2,4,6,8]
      > gojq -cn '[range(10)] |.[] |= select(. == 2)'
      [2]
  •   > jq -cn '[range(10)] | .[] |= select(. <= 3)'
      [0,1,2,3,5,7,9,null,null,null]
      > gojq -cn '[range(10)] | .[] |= select(. <= 3)'
      [0,1,2,3]