sethgrid / pester

Go (golang) http calls with retries and backoff

tests are failing

mariash opened this issue

go test fails with and without -race.

$ go test
--- FAIL: TestConcurrent2Retry0 (0.00s)
    main_test.go:56:
         1461345707 Get [GET] http://localhost:9000/foo request-1 retry-2 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused

    main_test.go:59: got 1 attempts, want 2
--- FAIL: TestConcurrentRequests (2.01s)
    main_test.go:33:
         1461345707 Get [GET] http://localhost:9000/foo request-0 retry-2 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345707 Get [GET] http://localhost:9000/foo request-2 retry-2 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345707 Get [GET] http://localhost:9000/foo request-1 retry-2 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345708 Get [GET] http://localhost:9000/foo request-1 retry-3 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345708 Get [GET] http://localhost:9000/foo request-0 retry-3 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345708 Get [GET] http://localhost:9000/foo request-2 retry-3 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345709 Get [GET] http://localhost:9000/foo request-1 retry-4 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused

    main_test.go:36: got 7 attempts, want 9
FAIL
exit status 1
FAIL    github.com/sethgrid/pester  16.338s
$ go test -race
--- FAIL: TestConcurrent2Retry0 (0.00s)
    main_test.go:56:
         1461345743 Get [GET] http://localhost:9000/foo request-1 retry-2 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused

    main_test.go:59: got 1 attempts, want 2
--- FAIL: TestConcurrentRequests (2.01s)
    main_test.go:33:
         1461345743 Get [GET] http://localhost:9000/foo request-1 retry-2 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345743 Get [GET] http://localhost:9000/foo request-2 retry-2 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345743 Get [GET] http://localhost:9000/foo request-0 retry-2 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345744 Get [GET] http://localhost:9000/foo request-0 retry-3 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345744 Get [GET] http://localhost:9000/foo request-1 retry-3 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345744 Get [GET] http://localhost:9000/foo request-2 retry-3 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345745 Get [GET] http://localhost:9000/foo request-0 retry-4 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused
        1461345745 Get [GET] http://localhost:9000/foo request-1 retry-4 error: Get http://localhost:9000/foo: dial tcp [::1]:9000: getsockopt: connection refused

    main_test.go:36: got 8 attempts, want 9
FAIL
exit status 1
FAIL    github.com/sethgrid/pester  16.351s

The tests pass locally for me with and without the race flag, but they do sometimes fail with the same error you are seeing: dial tcp [::1]:9000: getsockopt: connection refused. The failures seem to come in waves. I was able to make the error less frequent by lowering the concurrency settings in the tests. My first thought is that the tests are opening too many connections. That should not be the case, but I think it may be. You can try changing your ulimit (which, in my experience, works more reliably on Linux than on Mac). I'll see if I can figure out why the tests are intermittently failing.

Thanks for the report; I'll keep investigating.

Ah ha! Here is the fix.
The problem was that the previous optimizations were causing late writes to the error log, and the tests use the error log to verify that the right number of responses came back. This was inherently racy: some responses might not have come back yet at the point where the tests asserted that they had.

The fix is simple: provide a way for pester to say "I still have requests from which I have not gotten responses." The tests now make use of pester.Client.Wait() to make sure all responses come back prior to asserting log length.
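
For anyone curious what that pattern looks like in isolation, here is a minimal sketch using only the standard library. It is not pester's actual code: asyncClient, Do, and LogLen are hypothetical names made up for illustration, and the sleep stands in for network latency. It assumes the wait is built on a sync.WaitGroup that tracks each in-flight attempt.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// asyncClient is a hypothetical stand-in (not pester's real type) for a
// client that records failed attempts in an error log from goroutines.
type asyncClient struct {
	mu  sync.Mutex     // guards log
	wg  sync.WaitGroup // counts attempts that have not logged yet
	log []string
}

// Do simulates one failing attempt whose log entry is written late.
func (c *asyncClient) Do(id int) {
	c.wg.Add(1) // register before starting the goroutine
	go func() {
		defer c.wg.Done()
		time.Sleep(time.Millisecond) // simulated network latency
		c.mu.Lock()
		c.log = append(c.log, fmt.Sprintf("request-%d error: connection refused", id))
		c.mu.Unlock()
	}()
}

// Wait blocks until every in-flight attempt has written its log entry.
func (c *asyncClient) Wait() { c.wg.Wait() }

// LogLen returns the number of entries recorded so far.
func (c *asyncClient) LogLen() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.log)
}

func main() {
	const want = 9
	c := &asyncClient{}
	for i := 0; i < want; i++ {
		c.Do(i)
	}
	// Without this call, LogLen may observe fewer than want entries,
	// which is the "got 7 attempts, want 9" style of failure.
	c.Wait()
	fmt.Printf("got %d attempts, want %d\n", c.LogLen(), want)
}
```

Removing the c.Wait() call in main should make the count come up short intermittently, much like the failures reported above.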

Thanks for reporting the issue!