jaemk / cached

Rust cache structures and easy function memoization


Cache an async function without calling `.await`

absoludity opened this issue

A question (sorry for any confusion herein): is there a pattern for caching an async fn() without calling .await when the result is already cached?

That is, the normal pattern:

use cached::proc_macro::cached;
use tokio::time::{sleep, Duration};

/// Should only sleep the first time it's called.
#[cached]
async fn cached_sleep_secs(secs: u64) {
    sleep(Duration::from_secs(secs)).await;
}

which is then called with

cached_sleep_secs(1).await;

So even just checking the in-memory cache and returning the cached result, a non-async operation, is handled as an async operation.

Why is this an issue? AFAICT, when used with tokio, the call to .await is treated as an I/O event and the task is put to sleep, in this case unnecessarily. With lots of requests, this leads to CPU starvation. What I'd like to do manually is something like:

let res = match cache.get(&key) {
    // Cache hit: return the stored value without awaiting.
    Some(hit) => hit.clone(),
    // Cache miss: await the real async work (and store the result).
    None => decorated_fn().await,
};
// continue with rest of fn

but I'm not sure how I can access the global cache (or maybe I need to use a non-proc macro?).

Unfortunately, if you want to manipulate a cache without async operations while populating it from an async operation, as in your last example, you have to do so manually, without any macros.

In order for the macros to preserve the signature of the functions being cached, all of the code (explicit and generated) has to live inside the async function definition. Additionally, cached async functions use async synchronization types (async-mutex, async-rw-lock), which introduce additional wait points; this is necessary for the locking to cooperate with the async control flow.

So to your question: if you want to operate on the cache without async locks, you can, but you need to do so by manually defining your cache and wrapping it in a non-async synchronization type (std Mutex/RwLock). You can then write something like your last example above (with the addition of calls to .lock/.read/.write to synchronize container access). This will work; however, the fact that you are then calling an async function means this code must exist in an async context, so the non-async synchronization types are essentially equivalent to "blocking IO" whenever there is lock contention, since waiting for the lock is not a concurrent operation.
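For what it's worth, here is a minimal sketch of that manual approach, assuming a tokio runtime and the once_cell crate; CACHE, fetch_value, and cached_fetch are illustrative names, not part of this crate's API:

use std::collections::HashMap;
use std::sync::RwLock;

use once_cell::sync::Lazy;

static CACHE: Lazy<RwLock<HashMap<u64, String>>> =
    Lazy::new(|| RwLock::new(HashMap::new()));

/// Stand-in for the real async work.
async fn fetch_value(key: u64) -> String {
    format!("value-{key}")
}

async fn cached_fetch(key: u64) -> String {
    // Fast path: a sync read lock; no awaiting when the value is cached.
    if let Some(hit) = CACHE.read().unwrap().get(&key) {
        return hit.clone();
    }
    // Slow path: do the async work *without* holding the lock, then take
    // a short-lived write lock to store the result. Two tasks that miss
    // concurrently may both compute the value; the second insert wins.
    let value = fetch_value(key).await;
    CACHE.write().unwrap().insert(key, value.clone());
    value
}

The important property is that the hit path never awaits, so the executor is not asked to reschedule the task.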

And your last question: by default (you can specify otherwise to the macro), the global cache is named after the function, in all caps. But keep in mind that the global cache needs to be synchronized and will be wrapped in a synchronization type. The synchronization type will be sync or async depending on the async-ness of the function.
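So for the earlier example, something like the following should reach the generated global. This is a sketch; the exact wrapper type varies across cached versions, so treat the access pattern as illustrative:

use std::time::Duration;

use cached::proc_macro::cached;
use cached::Cached;
use tokio::time::sleep;

#[cached]
async fn cached_sleep_secs(secs: u64) {
    sleep(Duration::from_secs(secs)).await;
}

#[tokio::main]
async fn main() {
    cached_sleep_secs(1).await;
    // The generated global is the function name in all caps; for an
    // async function it sits behind an async lock, hence the .await.
    let size = CACHED_SLEEP_SECS.lock().await.cache_size();
    println!("entries cached: {size}");
}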

OK, thanks James for the thought-out response to my vague question - that helps a lot. Having re-read it a few times, I think what I'm after, whether or not it's possible, is a cache wrapped in std sync for reads but async sync for writes. Though I'm not sure that would even fix my issue (rather than just increase the blocking IO).

Thanks again.

Additionally, cached async functions use async synchronization types (async-mutex, async-rw-lock), which introduce additional wait points; this is necessary for the locking to cooperate with the async control flow.

Actually, I have another question about this, if you have time (sorry): it's not clear to me why we'd necessarily need an async rw-lock here, unless we need to call .await while holding the lock - which we don't necessarily. Isn't the write lock only required while writing the value back to the cache?

Unless you mean that because cached must preserve the function signature, it is forced to call .await inside the wrapping RwLock? If that's the only reason, then a manual cache with a sync RwLock should work fine, I think? (Re-reading your comment, that's what you actually said - so yes, perhaps I'll try that.)

then a manual cache with a sync RwLock should work fine

Yes, that's correct, but as you mentioned, you need to be careful not to call any async methods while holding the lock, since the executor may "yield" to another task that tries to acquire the same lock while it is still held.
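To make that hazard concrete, here's a small sketch assuming a tokio runtime (COUNTER, bad, and good are illustrative names); the failure mode is a std lock guard held across an .await:

use std::sync::Mutex;

static COUNTER: Mutex<u64> = Mutex::new(0);

async fn bad() {
    // BAD: the std MutexGuard is held across the .await. On a
    // single-threaded executor, another task blocking on COUNTER.lock()
    // stalls the thread, and this future never resumes to release it.
    let mut guard = COUNTER.lock().unwrap();
    tokio::task::yield_now().await;
    *guard += 1;
}

async fn good() {
    // GOOD: finish all lock work before awaiting anything.
    {
        let mut guard = COUNTER.lock().unwrap();
        *guard += 1;
    }
    tokio::task::yield_now().await;
}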

FWIW, using a standard sync read-only lock on the cache does indeed halve the query time for my 50 requests. Unfortunately though, because the Cached trait requires a mutable reference for get:

    fn cache_get(&mut self, k: &K) -> Option<&V>;

which, AFAICT, is only needed for the hit/miss counts, you can't call .get with a read-only lock, so I'm currently having to use my own hash map.
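Concretely, with a std RwLock the &mut receiver means a read guard can't call it. A minimal sketch of the limitation, using cached's SizedCache for illustration:

use std::sync::RwLock;

use cached::{Cached, SizedCache};

fn main() {
    let cache: RwLock<SizedCache<u64, String>> =
        RwLock::new(SizedCache::with_size(100));

    // Does not compile: cache_get takes &mut self, and a read guard
    // only provides &SizedCache.
    // let hit = cache.read().unwrap().cache_get(&1);

    // Compiles, but every reader now contends on the write lock.
    let hit = cache.write().unwrap().cache_get(&1).cloned();
    assert!(hit.is_none());
}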

Let me know if you think that's an issue (that a mutable ref is required for get) and I'll create an issue and try to get you a PR for it. Having a mutable ref there also stops you from using other patterns for the cache, such as promoting new state rather than mutating existing state (which is the direction I'm heading in my case, for simplicity; sketched below).
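A minimal sketch of that promotion pattern, assuming the arc-swap and once_cell crates (SNAPSHOT, get, and promote are illustrative names): readers take a cheap snapshot with no &mut, and writers publish a whole new map:

use std::collections::HashMap;
use std::sync::Arc;

use arc_swap::ArcSwap;
use once_cell::sync::Lazy;

static SNAPSHOT: Lazy<ArcSwap<HashMap<u64, String>>> =
    Lazy::new(|| ArcSwap::from_pointee(HashMap::new()));

fn get(key: u64) -> Option<String> {
    // Lock-free read of the current snapshot; no mutable access needed.
    SNAPSHOT.load().get(&key).cloned()
}

fn promote(key: u64, value: String) {
    // Clone the current map, add the entry, and swap the new version in.
    // Write-heavy workloads pay for the clone in memory and time.
    let mut next = HashMap::clone(&SNAPSHOT.load_full());
    next.insert(key, value);
    SNAPSHOT.store(Arc::new(next));
}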

That's an interesting pattern. I'm not sure we can replace everything wholesale with the promoting version, though, since some things, like hit/miss counts, certain cache types (lru), and synchronized writes, rely on mutating the shared value. I would be open to introducing a new macro option that generates code following this pattern for constrained cases, e.g. any type that can support non-mut gets, with the caveats that hit/miss counters will not function and write-heavy logic could result in very high memory consumption.

Yeah - the promoting-new-state pattern was just a side note, in that I'd be able to use the existing ExpiringCredentialCache struct with that pattern in my own code if cache_get didn't require &mut self. I tried simply updating that type to use &self instead, but then it can no longer implement the Cached trait, I think.

I think the move here is to create a separate trait, CachedNotMutGet (maybe there's a better name), and implement it for any type that can support it, to indicate the ones that could be used with a promotion pattern.
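A rough sketch of what that could look like; nothing here exists in the crate yet, and CachedNotMutGet is just the name floated above:

use std::collections::HashMap;
use std::hash::Hash;

/// Lookup without exclusive access, so callers can hold only a read
/// lock (with the caveat that hit/miss counters can't be updated).
trait CachedNotMutGet<K, V> {
    fn cache_get(&self, k: &K) -> Option<&V>;
}

impl<K: Eq + Hash, V> CachedNotMutGet<K, V> for HashMap<K, V> {
    fn cache_get(&self, k: &K) -> Option<&V> {
        self.get(k)
    }
}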