lestrrat-go / httprc

Quasi Up-to-date HTTP In-memory Cache

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

lastFetch is set even if fetch fails

marc-ste opened this issue · comments

In queue.go

// synchronously go fetch
e.lastFetch = time.Now()
res, err := q.fetch.fetch(ctx, e.request)
if err != nil {
    // Even if the request failed, we need to queue the next fetch
    q.enqueueNextFetch(nil, e)
    return fmt.Errorf(`failed to fetch %q: %w`, e.request.url, err)
}

If e.lastFetch is written before the fetch method finishes, it will be set regardless of the success of fetch(). This means that subsequent fetches don't try and fetch the object even if entry.data is nil.

@marc-ste It has been a while since I last checked this code, so please forgive me for sounding a bit awkward.

If I'm not mistaken, lastFetch records the last fetch attempt, regardless of the result of the fetch.
cache.go doesn't automatically perform a fetch repeatedly on URLs that failed, because... it's wasteful. But you can force it to fetch using Refresh().

queue.go doesn't seem to have a logic that will stop processing depending on the value of lastFetch, so I don't think it is affected by whatever the value of lastFetch is.

At least that's my take, albeit there could always be oversights. If you are encountering a problem, can you please provide some code?

If I'm not mistaken, lastFetch records the last fetch attempt, regardless of the result of the fetch.

Maybe thats the literal interpretation of it, but I think its misleading. See below.

cache.go doesn't automatically perform a fetch repeatedly on URLs that failed, because... it's wasteful. But you can force it to fetch using Refresh().

I can understand why we don't want to repeatedly fetch, but marking it as fetched and populating that entry with nil doesn't feel appropriate.
I think the issue is here: https://github.com/lestrrat-go/httprc/blob/main/cache.go#L155

if forceRefresh || !e.hasBeenFetched() {
    if err := c.queue.fetchAndStore(ctx, e); err != nil {
        e.releaseSem()
	return nil, fmt.Errorf(`failed to fetch %q: %w`, u, err)
    }
}

Even If the last fetch failed, e.hasBeenFetched() will resolve to true and the cache will return a nil.

Even If the last fetch failed, e.hasBeenFetched() will resolve to true and the cache will return a nil.

I'm sorry if I sound obtuse, but I don't understand.
Returning nil when the previous fetch failed is the intended behavior, because we wouldn't know if the resource is, for example, actually 404 (or maybe it's HTML instead of JSON) for good, or missing/malformed temporarily and is something we can recover. Only the caller knows if the URL should be retried, therefore we won't try to fetch repeatedly.

There might be cases that we missed which is causing you problems, i.e. our "intended behavior" is wrong, or maybe we have misleading documentation.
But while you have described the internal code that you deem wrong, you have yet to actually show me any code or explanation as to why this matters, so ATM I don't see the problem.