(v3.0.2) BATS output not handled properly

znerd opened this issue 6 years ago

Describe the bug
When BATS produces output with a fair amount of detail, tap-junit (v3.0.2) misinterprets that input.

To Reproduce
Steps to reproduce the behavior:

Have a file named unit-test-results.tap.txt with the following contents:

1..4
ok 1 dry run (DEBUG) - status
not ok 2 dry run (DEBUG) - output
# (from function `assert_equal' in file /opt/bats-helpers/bats-assert/src/assert.bash, line 91,
#  in test file /Users/johndoe/acmeinc/bblocks-publish/src/test/bats/009-dry-run-debug.bats, line 25)
#   `assert_equal "${lines[3]}" "DEBUG: CODEBASE_DIR=/code"' failed
# 
# -- values do not equal --
# expected : DEBUG: CODEBASE_DIR=/code
# actual   : DEBUG: CODEBASE_DIR=/Users/johndoe/acmeinc/bblocks-publish
# --
# 
not ok 3 dry run (DEBUG) - output length
# (from function `assert_equal' in file /opt/bats-helpers/bats-assert/src/assert.bash, line 91,
#  in test file /Users/johndoe/acmeinc/bblocks-publish/src/test/bats/009-dry-run-debug.bats, line 40)
#   `assert_equal "${#lines[@]}" 24' failed
# 
# -- values do not equal --
# expected : 24
# actual   : 16
# --
# 
not ok 4 dry run (DEBUG) - output ALL - TODO: Remove
# (from function `assert_equal' in file /opt/bats-helpers/bats-assert/src/assert.bash, line 91,
#  in test file /Users/johndoe/acmeinc/bblocks-publish/src/test/bats/009-dry-run-debug.bats, line 44)
#   `assert_equal "${output}" ""' failed
# 
# -- values do not equal --
# expected (0 lines):
# 
# actual (16 lines):
#   DEBUG: DEBUG=yes
#   DEBUG: DRY_RUN=yes
#   DEBUG: SKIP=no
#   DEBUG: CODEBASE_DIR=/Users/johndoe/acmeinc/bblocks-publish
#   DEBUG: BBLOCKSFILE=BBlocksfile
#   DEBUG: ARTIFACT_REPO_AUTH is set.
#   DEBUG: CURL_CMD=curl
#   DRY_RUN: curl --version
#   DEBUG: curl --version succeeded.
#   DEBUG: CURL_TIMEOUT_IN_SEC=120
#   DEBUG: CURL_CONNECT_TIMEOUT_IN_SEC=10
#   DEBUG: Publish repo URL is: https://artifactory-de.acmeinc.com/artifactory/libs-release-local/
#   DEBUG: Coordinates: com.acmeinc.bblocks :: bblocks-fetch :: 0-SNAPSHOT
#   Publishing script to: TODO
#   /Users/johndoe/acmeinc/bblocks-publish/target/bblocks-fetch: line 157: ${target_dir}/${self.coords.name}: bad substitution
#   DEBUG: cURL exit code: 1
# --
#

Pass this into tap-junit, e.g.:

cat unit-test-results.tap.txt | tap-junit > junit.xml

Expected behavior

On step 2, valid JUnit XML is produced.
The XML contains either 1 or 4 test suites (depending on how it maps the TAP tests).
The XML contains exactly 4 test cases.

Actual behavior

OK: On step 2, valid JUnit XML is produced.
NOK: The XML contains 45 test suites.
OK: The XML contains exactly 4 test cases.

Here is the produced JUnit XML:

<?xml version="1.0"?>
<testsuites tests="4" name="Tap-Junit" failures="3" errors="0">
  <testsuite tests="2" failures="1" errors="0" name="dry run (DEBUG) - status">
    <testcase name="#1 dry run (DEBUG) - status"/>
    <testcase name="#2 dry run (DEBUG) - output">
      <failure message="not ok 2 dry run (DEBUG) - output"/>
    </testcase>
  </testsuite>
  <testsuite tests="0" failures="0" errors="0" name="(from function `assert_equal' in file /opt/bats-helpers/bats-assert/src/assert.bash, line 91,"/>
  <testsuite tests="0" failures="0" errors="0" name="in test file /Users/johndoe/acmeinc/bblocks-publish/src/test/bats/009-dry-run-debug.bats, line 25)"/>
  <testsuite tests="0" failures="0" errors="0" name="`assert_equal &quot;${lines[3]}&quot; &quot;DEBUG: CODEBASE_DIR=/code&quot;' failed"/>
  <testsuite tests="0" failures="0" errors="0" name=" "/>
  <testsuite tests="0" failures="0" errors="0" name="-- values do not equal --"/>
  <testsuite tests="0" failures="0" errors="0" name="expected : DEBUG: CODEBASE_DIR=/code"/>
  <testsuite tests="0" failures="0" errors="0" name="actual   : DEBUG: CODEBASE_DIR=/Users/johndoe/acmeinc/bblocks-publish"/>
  <testsuite tests="0" failures="0" errors="0" name="--"/>
  <testsuite tests="1" failures="1" errors="0" name=" ">
    <testcase name="#3 dry run (DEBUG) - output length">
      <failure message="not ok 3 dry run (DEBUG) - output length"/>
    </testcase>
  </testsuite>
  <testsuite tests="0" failures="0" errors="0" name="(from function `assert_equal' in file /opt/bats-helpers/bats-assert/src/assert.bash, line 91,"/>
  <testsuite tests="0" failures="0" errors="0" name="in test file /Users/johndoe/acmeinc/bblocks-publish/src/test/bats/009-dry-run-debug.bats, line 40)"/>
  <testsuite tests="0" failures="0" errors="0" name="`assert_equal &quot;${#lines[@]}&quot; 24' failed"/>
  <testsuite tests="0" failures="0" errors="0" name=" "/>
  <testsuite tests="0" failures="0" errors="0" name="-- values do not equal --"/>
  <testsuite tests="0" failures="0" errors="0" name="expected : 24"/>
  <testsuite tests="0" failures="0" errors="0" name="actual   : 16"/>
  <testsuite tests="0" failures="0" errors="0" name="--"/>
  <testsuite tests="1" failures="1" errors="0" name=" ">
    <testcase name="#4 dry run (DEBUG) - output ALL - TODO: Remove">
      <failure message="not ok 4 dry run (DEBUG) - output ALL - TODO: Remove"/>
    </testcase>
  </testsuite>
  <testsuite tests="0" failures="0" errors="0" name="(from function `assert_equal' in file /opt/bats-helpers/bats-assert/src/assert.bash, line 91,"/>
  <testsuite tests="0" failures="0" errors="0" name="in test file /Users/johndoe/acmeinc/bblocks-publish/src/test/bats/009-dry-run-debug.bats, line 44)"/>
  <testsuite tests="0" failures="0" errors="0" name="`assert_equal &quot;${output}&quot; &quot;&quot;' failed"/>
  <testsuite tests="0" failures="0" errors="0" name=" "/>
  <testsuite tests="0" failures="0" errors="0" name="-- values do not equal --"/>
  <testsuite tests="0" failures="0" errors="0" name="expected (0 lines):"/>
  <testsuite tests="0" failures="0" errors="0" name=" "/>
  <testsuite tests="0" failures="0" errors="0" name="actual (16 lines):"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: DEBUG=yes"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: DRY_RUN=yes"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: SKIP=no"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: CODEBASE_DIR=/Users/johndoe/acmeinc/bblocks-publish"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: BBLOCKSFILE=BBlocksfile"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: ARTIFACT_REPO_AUTH is set."/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: CURL_CMD=curl"/>
  <testsuite tests="0" failures="0" errors="0" name="DRY_RUN: curl --version"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: curl --version succeeded."/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: CURL_TIMEOUT_IN_SEC=120"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: CURL_CONNECT_TIMEOUT_IN_SEC=10"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: Publish repo URL is: https://artifactory-de.acmeinc.com/artifactory/libs-release-local/"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: Coordinates: com.acmeinc.bblocks :: bblocks-fetch :: 0-SNAPSHOT"/>
  <testsuite tests="0" failures="0" errors="0" name="Publishing script to: TODO"/>
  <testsuite tests="0" failures="0" errors="0" name="/Users/johndoe/acmeinc/bblocks-publish/target/bblocks-fetch: line 157: ${target_dir}/${self.coords.name}: bad substitution"/>
  <testsuite tests="0" failures="0" errors="0" name="DEBUG: cURL exit code: 1"/>
  <testsuite tests="0" failures="0" errors="0" name="--"/>
  <testsuite tests="0" failures="0" errors="0" name=" "/>
</testsuites>
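The suite and case counts listed under "Actual behavior" can be checked mechanically. The following is a minimal sketch, not part of tap-junit: `countElements` is a hypothetical helper that counts opening tags with a regular expression, which is good enough for generated XML like the above but is not a real XML parser.

```javascript
// Hypothetical helper for illustration; not part of tap-junit.
// Counts opening tags named exactly `tag` (so "testsuite" does not match "testsuites").
function countElements(xml, tag) {
  const pattern = new RegExp('<' + tag + '[\\s>/]', 'g');
  const matches = xml.match(pattern);
  return matches === null ? 0 : matches.length;
}

// A shortened excerpt of the XML above, used only as sample input.
const junitSample = [
  '<testsuites tests="2" name="Tap-Junit" failures="1" errors="0">',
  '  <testsuite tests="2" failures="1" errors="0" name="dry run (DEBUG) - status">',
  '    <testcase name="#1 dry run (DEBUG) - status"/>',
  '    <testcase name="#2 dry run (DEBUG) - output">',
  '      <failure message="not ok 2 dry run (DEBUG) - output"/>',
  '    </testcase>',
  '  </testsuite>',
  '  <testsuite tests="0" failures="0" errors="0" name="-- values do not equal --"/>',
  '</testsuites>',
].join('\n');

console.log('test suites:', countElements(junitSample, 'testsuite'));
console.log('test cases:', countElements(junitSample, 'testcase'));
```

Run against the full junit.xml, a check like this should report the 45 test suites and 4 test cases described above.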

Dustin Hershman · Answer 1 · Mon Feb 18 2019 23:18:12 GMT+0800 (China Standard Time)

Damn! Okay, new test case inbound.

EDIT:

So what I am picking up is that the report only feeds back 4 tests but your actual tests contain 45 in total, the rest of which are not being counted?

EDIT 2:

It looks like the count is based on asserts (of which there are 4), and then there is a tests count of 37; I am not sure what those are being tracked as, other than type: 'test'.

Ernst de Haan · Answer 2 · Tue Feb 19 2019 00:46:09 GMT+0800 (China Standard Time)

@dhershman1 You wrote:

So what I am picking up is that the report only feeds back 4 tests but your actual tests contain 45 in total, the rest of which are not being counted?

…but I am not sure what you mean. I have 4 tests in total, and the results are in the BATS output.

If you want I can produce an original .bats file that reproduces the issue?

Dustin Hershman · Answer 3 · Tue Feb 19 2019 02:07:06 GMT+0800 (China Standard Time)

I misread your original report. If you want to share the original test file that would possibly be beneficial too!

Dustin Hershman · Answer 4 · Thu Feb 21 2019 23:02:31 GMT+0800 (China Standard Time)

After some further research into the matter, it seems it's going to be kind of hard to plan this out fully. I might need to take a look at the original tests that generated that output, if you still have them, just to run a few scenarios and look for a pattern of some kind.

Ernst de Haan · Answer 5 · Fri Feb 22 2019 16:03:41 GMT+0800 (China Standard Time)

Here's a better reproduction case:
tap-junit-issue-23.bats.txt

Steps to reproduce:

Rename tap-junit-issue-23.bats.txt to tap-junit-issue-23.bats.
Run bats 1.0.0 on the bats script, e.g.:

docker run -v $(pwd):/tests -i dduportal/bats:1.0.0 . | tee bats-1.0.0.output.txt

Run bats 1.1.0 on the bats script, e.g.:

docker run -v $(pwd):/tests -i mindcurv/bats:1.1.0 . | tee bats-1.1.0.output.txt

Run tap-junit on the bats-1.0.0.output.txt file.
Run tap-junit on the bats-1.1.0.output.txt file.

Actual output:

On step 2: bats-1.0.0.output.txt
On step 3: bats-1.1.0.output.txt

Dustin Hershman · Answer 6 · Fri Mar 08 2019 01:31:17 GMT+0800 (China Standard Time)

Okay, sorry for the delay on status (work has picked up quite a lot).

I've identified that the issue is for sure in tap-out: it looks at the TAP output and assumes that everything with a # sign in front of it is a new test. That is obviously incorrect here, since in proper TAP output # is used for comments and directives.

I believe this fault also lies in tape, which tap-out is built around. Tape takes a test title and adds it to the output as a comment. For example, in my unit testing I have this:

test('1 === 1', t => {
  t.plan(3)
  t.equal(1, 1, 'test is equal', {data: 'cool'})
  t.equal(1, 1, 'test skip extra', {skip: true})
  t.notEqual(1, 0)
  t.end()
})

Now when you run that through tape, the tap output comes out like so:

# 1 === 1
ok 1 test is equal
ok 2 test skip extra # SKIP
ok 3 should not be equal

We see here that it is placing my "title" as # 1 === 1 in the output.

It also comes down to this: when tape emits a comment, for an error for instance, it wraps it in --- characters.

So this problem lies in the modules tap-junit relies on for its output. Since the title can be literally anything, it's really hard to determine what is an actual "title" and what is a comment.
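To make the distinction concrete, here is a minimal sketch of the alternative reading, in which a line starting with # never opens a new test but is kept as output of the most recent test line. `attachComments` is a hypothetical helper, not the tap-out or tap-junit implementation, and it assumes BATS-style output where the comments follow the test they describe. Tape-style titles, which come before their tests, would need different handling.

```javascript
// Hypothetical helper for illustration; not part of tap-junit or tap-out.
// Assumes BATS-style output, where "#" lines follow the test they describe.
function attachComments(tapText) {
  const tests = [];
  for (const line of tapText.split(/\r?\n/)) {
    const match = /^(not )?ok\b\s*(\d+)?\s*(.*)$/.exec(line);
    if (match) {
      // A test line starts a new result.
      tests.push({
        ok: !match[1],
        id: match[2] ? Number(match[2]) : tests.length + 1,
        name: match[3].trim(),
        comments: [],
      });
    } else if (line.startsWith('#') && tests.length > 0) {
      // A comment line is kept as output of the most recent test.
      tests[tests.length - 1].comments.push(line.replace(/^# ?/, ''));
    }
    // The plan ("1..3"), blank lines and anything else are ignored here.
  }
  return tests;
}

// Shortened sample based on the BATS output in the original report.
const batsSample = [
  '1..3',
  'ok 1 dry run (DEBUG) - status',
  'not ok 2 dry run (DEBUG) - output',
  '# -- values do not equal --',
  '# expected : DEBUG: CODEBASE_DIR=/code',
  '# --',
  'not ok 3 dry run (DEBUG) - output length',
].join('\n');

for (const t of attachComments(batsSample)) {
  console.log(`${t.ok ? 'ok' : 'not ok'} ${t.id} ${t.name} (${t.comments.length} comment lines)`);
}
```

With this reading, the sample yields three tests, and the three comment lines end up on test 2 rather than becoming suites of their own.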

EDIT 1:
So, I am able to do some slight tweaking in the assert counter and use that to filter the comments out of the output. HOWEVER, this means I am unable to collect comments for the output either. So if that's required, this may not be the way to go. I am still looking into solutions, however. Will report back.

Dustin Hershman · Answer 7 · Tue Mar 12 2019 22:12:55 GMT+0800 (China Standard Time)

I have a solution in place, but the names get a little thrown off.

This is the quickest go-to way to alleviate the primary issue. The long-term solution might be to do the parsing myself, which is on the table for the future. But to get this up and working again I think the solution for now is to measure the asserts...

I would love to get your thoughts on it @znerd

Ernst de Haan · Answer 8 · Wed Mar 13 2019 00:48:27 GMT+0800 (China Standard Time)

[…] to get this up and working again I think the solution for now is to measure the asserts...

I'm not sure what you are proposing?

From my POV:

if the problem is in the underlying library, then perhaps we should raise a bug report with them;
if the underlying standard (TAP) is unclear, then we could raise an RFE with them;
if the output from BATS is not TAP-compliant, then we should raise an RFE with them.

Dustin Hershman · Answer 9 · Wed Mar 13 2019 02:19:37 GMT+0800 (China Standard Time)

Okay I went over to tap-out to open an issue there to see where this might fall, or if there's something I can look out for.

Pixcell · Answer 10 · Thu Aug 08 2019 08:18:37 GMT+0800 (China Standard Time)

Hi,

I am trying to use tap-junit to convert BATS TAP output to JUnit XML. I am having exactly the same issue as @znerd.

This is the tap output (results.tap):

1..18
ok 1 Help
ok 2 Help w/o flag
ok 3 create Help
ok 4 delete Help
ok 5 deploy Help
ok 6 describe Help
ok 7 connect Help
ok 8 disconnect Help
ok 9 legacy Help
ok 10 logs Help
ok 11 get Help
ok 12 version
not ok 13 Get All
# (from function `test' in file test/functions.bash, line 4,
#  in test file test/smoke.bats, line 54)
#   `test iofogctl get all' failed
# NAMESPACE
# default
# 
# CONTROLLER	STATUS		AGE		UPTIME		IP		PORT		
# local-ecn	Failing		-		-		0.0.0.0		51121		
# 
# AGENT		STATUS		AGE		UPTIME		IP		VERSION		
# ioFog Agent	offline		-		-		0.0.0.0:54321	-		
# 
# �[38;5;1m✘ Post http://0.0.0.0:51121/api/v3/user/login: dial tcp 0.0.0.0:51121: connect: connection refused�[0m
ok 14 Get Namespaces
ok 15 Get Controllers
ok 16 Get Agents
ok 17 create namespace
ok 18 delete namespace

This is the output of tap-junit -s smoke -n results-smoke.xml -o ./ -i results.tap

<?xml version="1.0"?>
<testsuites tests="18" name="smoke" failures="1" errors="0">
  <testsuite tests="13" failures="1" errors="0" name="Help">
    <testcase name="#1 Help"/>
    <testcase name="#2 Help w/o flag"/>
    <testcase name="#3 create Help"/>
    <testcase name="#4 delete Help"/>
    <testcase name="#5 deploy Help"/>
    <testcase name="#6 describe Help"/>
    <testcase name="#7 connect Help"/>
    <testcase name="#8 disconnect Help"/>
    <testcase name="#9 legacy Help"/>
    <testcase name="#10 logs Help"/>
    <testcase name="#11 get Help"/>
    <testcase name="#12 version"/>
    <testcase name="#13 Get All">
      <failure message="not ok 13 Get All"/>
    </testcase>
  </testsuite>
  <testsuite tests="0" failures="0" errors="0" name="(from function `test' in file test/functions.bash, line 4,"/>
  <testsuite tests="0" failures="0" errors="0" name="in test file test/smoke.bats, line 54)"/>
  <testsuite tests="0" failures="0" errors="0" name="`test iofogctl get all' failed"/>
  <testsuite tests="0" failures="0" errors="0" name="NAMESPACE"/>
  <testsuite tests="0" failures="0" errors="0" name="default"/>
  <testsuite tests="0" failures="0" errors="0" name=" "/>
  <testsuite tests="0" failures="0" errors="0" name="CONTROLLER&#x9;STATUS&#x9;&#x9;AGE&#x9;&#x9;UPTIME&#x9;&#x9;IP&#x9;&#x9;PORT&#x9;&#x9;"/>
  <testsuite tests="0" failures="0" errors="0" name="local-ecn&#x9;Failing&#x9;&#x9;-&#x9;&#x9;-&#x9;&#x9;0.0.0.0&#x9;&#x9;51121&#x9;&#x9;"/>
  <testsuite tests="0" failures="0" errors="0" name=" "/>
  <testsuite tests="0" failures="0" errors="0" name="AGENT&#x9;&#x9;STATUS&#x9;&#x9;AGE&#x9;&#x9;UPTIME&#x9;&#x9;IP&#x9;&#x9;VERSION&#x9;&#x9;"/>
  <testsuite tests="0" failures="0" errors="0" name="ioFog Agent&#x9;offline&#x9;&#x9;-&#x9;&#x9;-&#x9;&#x9;0.0.0.0:54321&#x9;-&#x9;&#x9;"/>
  <testsuite tests="0" failures="0" errors="0" name=" "/>
  <testsuite tests="5" failures="0" errors="0" name="�[38;5;1m✘ Post http://0.0.0.0:51121/api/v3/user/login: dial tcp 0.0.0.0:51121: connect: connection refused�[0m">
    <testcase name="#14 Get Namespaces"/>
    <testcase name="#15 Get Controllers"/>
    <testcase name="#16 Get Agents"/>
    <testcase name="#17 create namespace"/>
    <testcase name="#18 delete namespace"/>
  </testsuite>
</testsuites>

It is indeed handling directives (lines starting with #) as new test cases instead of as test output.
Is there any news on progress towards fixing this?

Thank you very much,

Pixcell · Answer 11 · Thu Aug 08 2019 08:32:00 GMT+0800 (China Standard Time)

Okay I went over to tap-out to open an issue there to see where this might fall, or if there's something I can look out for.

@dhershman1 Could we have the link to the tap-out issue? Thanks!

Dustin Hershman · Answer 12 · Thu Aug 08 2019 21:40:47 GMT+0800 (China Standard Time)

@Pixcell I know you found it, but for future reference, here is the issue link: scottcorgan/tap-out#31

At this point I cannot do anything on my end because of the way the parser actually parses out the output and gives it to tap-junit.

Either this needs to be addressed on their end, or tap-junit would need to run its own parser, which I am not sure I can muster the time to do right at this moment. (The motivation to do so is sort of low, since that's a hefty task and wasn't the original intention of tap-junit.)

Pixcell · Answer 13 · Fri Aug 09 2019 04:06:34 GMT+0800 (China Standard Time)

@dhershman1 I understand, no problem :)

I had a look at the tap-out code, and it looks like a simple regex issue. But I don't know enough about the TAP format to be able to fix this regex. (For some reason, their test line regex contains #.)

Dustin Hershman · Answer 14 · Fri Aug 09 2019 04:14:45 GMT+0800 (China Standard Time)

@Pixcell Yeah, that is what I discovered as well. It's because Tape (a TAP testing suite) uses that for test lines, and I believe it uses -- for comment lines. So it's also a difference between testing suite formats.

Pixcell · Answer 15 · Fri Aug 09 2019 04:17:58 GMT+0800 (China Standard Time)

@dhershman1 That's a shame

Looks like Tape is not using a proper TAP format then ...

https://testanything.org/tap-specification.html
https://testanything.org/tap-version-13-specification.html

Dustin Hershman · Answer 16 · Fri Aug 09 2019 04:26:59 GMT+0800 (China Standard Time)

@Pixcell correct, I do not think it is an accurate output

$ node example/timing.js
TAP version 13
# timing test
ok 1 should be equal
not ok 2 should be equal
  ---
    operator: equal
    expected: 100
    actual:   107
  ...

1..2
# tests 2
# pass  1
# fail  1

That is an example of output from Tape.

Pixcell · Answer 17 · Fri Aug 09 2019 04:39:34 GMT+0800 (China Standard Time)

@dhershman1 They must be following TAP13

https://testanything.org/tap-version-13-specification.html

--- is used to delimit YAML output
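As a rough illustration of that delimiting, the sketch below collects the raw lines found between a --- line and the following ... line, as in the Tape output quoted earlier. `extractYamlBlocks` is a hypothetical helper; it only gathers the lines and does not parse them as YAML.

```javascript
// Hypothetical helper for illustration; it collects the raw lines of each
// block delimited by "---" and "..." and does not interpret them as YAML.
function extractYamlBlocks(tapText) {
  const blocks = [];
  let current = null; // lines of the block being read, or null when outside a block
  for (const line of tapText.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (current === null) {
      if (trimmed === '---') current = [];
    } else if (trimmed === '...') {
      blocks.push(current);
      current = null;
    } else {
      current.push(trimmed);
    }
  }
  return blocks;
}

// Sample taken from the Tape output quoted in this thread.
const tapeSample = [
  'TAP version 13',
  '# timing test',
  'ok 1 should be equal',
  'not ok 2 should be equal',
  '  ---',
  '    operator: equal',
  '    expected: 100',
  '    actual:   107',
  '  ...',
  '',
  '1..2',
].join('\n');

console.log(extractYamlBlocks(tapeSample));
```

A real parser would also tie each block to the test line that precedes it; that step is left out here to keep the sketch short.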

Pixcell · Answer 18 · Fri Aug 09 2019 04:45:00 GMT+0800 (China Standard Time)

Actually, my bad, I will edit the previous comment, as # directives are not meant to be separate lines as of TAP13. They are meant to be added to a test line, e.g. ok test xx # SKIP

So that leaves lines starting with # as comments, which should be ignored by the parser.
Any additional information should be given as YAML (according to TAP13).
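A minimal sketch of that distinction: a directive is read from the tail of a test line, while a line that starts with # is not a test line at all. `parseTestLine` is a hypothetical helper, and it only recognizes the SKIP and TODO directives in their plain form.

```javascript
// Hypothetical helper for illustration; not taken from any existing TAP parser.
// Returns null for anything that is not a test line, including "#" comment lines.
function parseTestLine(line) {
  const match = /^(not )?ok\b\s*(\d+)?\s*(.*)$/.exec(line);
  if (!match) return null;

  const rest = match[3];
  // A directive is a "#" followed by SKIP or TODO (case-insensitive) on the test line.
  const directive = /(?:^|\s)#\s*(SKIP|TODO)\b\s*(.*)$/i.exec(rest);
  return {
    ok: !match[1],
    id: match[2] ? Number(match[2]) : null,
    description: (directive ? rest.slice(0, directive.index) : rest).trim(),
    directive: directive ? directive[1].toUpperCase() : null,
    reason: directive ? directive[2].trim() : '',
  };
}

console.log(parseTestLine('ok 2 test skip extra # SKIP'));
console.log(parseTestLine('not ok 13 Get All'));
console.log(parseTestLine('# 1 === 1'));
```

The comment line in the last call is reported as "not a test line", which is the behavior the TAP13 reading above asks for.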

This raises another issue: BATS is not using TAP13, but the original TAP specification.

Looks like TAP parsers are in for a treat. They need to separate TAP13 from TAP.

Dustin Hershman · Answer 19 · Fri Aug 09 2019 04:49:51 GMT+0800 (China Standard Time)

@Pixcell Hmmm Interesting. Yeah that sounds like it'll be a lot of fun...

Dustin Hershman · Answer 20 · Tue Nov 24 2020 05:15:07 GMT+0800 (China Standard Time)

I dunno how much it matters now, but this will be fixed (finally) in v4.0.0, since I am completely rewriting the tool to use a new internal parser.

Soon ™️

Dustin Hershman · Answer 21 · Wed Nov 25 2020 04:29:29 GMT+0800 (China Standard Time)

Should be addressed with the release of v4.0.0