jaemk / cached

Rust cache structures and easy function memoization

Feature Request: Ignore certain function arguments

phayes opened this issue

I've got a function I'd like to memoize that looks like this:

fn do_expensive_db_query(db: &DbConnection, foo: &str) -> Bar {
   db.query("SELECT ...").args(foo).execute()
}

I'd like to be able to ignore the db argument, since it just contains a live connection to the database and has no impact on the memoized value. It also has an awkward lifetime: it's Clone, but it's tied back to a connection pool.

Hey @phayes, you can ignore arguments by specifying a convert block which generates the cache key from your arguments. Here's an example

#[cached(key = "String", convert = r#"{ format!("{}{}", a, b) }"#)]
There are more examples at https://docs.rs/cached/latest/cached/proc_macro/index.html. When you specify a convert block, you also need to specify either key (the type that will be used as the cache key) or type (the full cache type), because the macro can no longer infer the cache key type from your function arguments. So for your function, you might do something like

#[cached(
    type = "SizedCache<String, Bar>",
    create = "{ SizedCache::with_size(20) }",
    convert = r#"{ String::from(foo) }"#
)]
fn do_expensive_db_query(db: &DbConnection, foo: &str) -> Bar {
   db.query("SELECT ...").args(foo).execute()
}
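To make the convert block's role concrete, here's a std-only sketch of the same idea with DbConnection, Bar, and the query stubbed out (all names here are illustrative stand-ins, not the crate's generated code): the cache key is derived from foo alone, so db never influences lookups.

```rust
use std::collections::HashMap;

// Illustrative stand-ins for the real types.
struct DbConnection;
type Bar = String;

impl DbConnection {
    fn query(&self, foo: &str) -> Bar {
        format!("result for {foo}")
    }
}

// Memoize on `foo` alone: the connection never enters the key.
fn do_expensive_db_query(
    cache: &mut HashMap<String, Bar>,
    db: &DbConnection,
    foo: &str,
) -> Bar {
    let key = String::from(foo); // plays the role of the convert block
    if let Some(hit) = cache.get(&key) {
        return hit.clone(); // cache hit: db is unused
    }
    let value = db.query(foo); // cache miss: run the real query
    cache.insert(key, value.clone());
    value
}

fn main() {
    let db = DbConnection;
    let mut cache = HashMap::new();
    let first = do_expensive_db_query(&mut cache, &db, "x");
    let second = do_expensive_db_query(&mut cache, &db, "x"); // served from cache
    assert_eq!(first, second);
    assert_eq!(cache.len(), 1);
}
```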

Thanks @jaemk! I appreciate the detailed answer.

Will the generated wrapper do the right thing with db: &DbConnection and just drop it on the ground when writing the cache or reading with a cache-hit?

No problem!

The caller still passes the db connection reference, but it's not included in the value that is cached and it's not used to determine if there is a cached value. The only time it's used is when there's a cache miss and the body of your function needs to be executed to produce a value that will be returned and cloned to the cache.

Thanks @jaemk ,

I ask because I've noticed that when caching functions that take a db: &DbConnection, my database connections eventually stall and I get errors about the connection pool running out of connections.

My entire application is async, and my Cargo.toml looks like this:

cached = { version = "0.40", features = ["async", "async_tokio_rt_multi_thread"] }

Is there anything in particular we need to do to use the async side of the cached crate correctly?

That's interesting. Have you confirmed that it only happens when a function is #[cached]? The core of the macro is right here

async fn inner(#inputs) #output #body;
let result = inner(#(#input_names),*).await;
let mut cache = #cache_ident.lock().await;
#set_cache_block
result

There are no modifications made to the function contents you are wrapping. It's simply copied to the "inner" function and then called when a cached value isn't found. The above chunk is referenced here as the #do_set_return_block

#visibility #signature_no_muts {
    use cached::Cached;
    let key = #key_convert_block;
    {
        // check if the result is cached
        let mut cache = #cache_ident.lock().await;
        if let Some(result) = cache.cache_get(&key) {
            #return_cache_block
        }
    }
    #do_set_return_block
}
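For readers unfamiliar with that shape, here is a hand-written synchronous analogue of the expansion (std::sync::Mutex standing in for the async lock; the function name, key, and body are made up for illustration):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Global cache, lazily initialized on first use.
static CACHE: Mutex<Option<HashMap<String, u64>>> = Mutex::new(None);

fn cached_square(foo: &str) -> u64 {
    let key = String::from(foo); // the convert block
    {
        // check for a cached value first
        let mut guard = CACHE.lock().unwrap();
        let cache = guard.get_or_insert_with(HashMap::new);
        if let Some(hit) = cache.get(&key) {
            return *hit;
        }
    } // lock released before running the body

    // the original function body, moved unchanged into "inner"
    fn inner(foo: &str) -> u64 {
        (foo.len() as u64).pow(2)
    }
    let result = inner(foo);

    // re-acquire the lock to store the result
    let mut guard = CACHE.lock().unwrap();
    guard.get_or_insert_with(HashMap::new).insert(key, result);
    result
}

fn main() {
    assert_eq!(cached_square("abc"), 9); // miss: computed
    assert_eq!(cached_square("abc"), 9); // hit: returned from cache
}
```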

My only thought is that the additional await points and potential lock contention introduced by the global cache may mean that connections are held longer than necessary when many instances of this task run concurrently.

It may help if you pass a connection pool to this function instead of a connection instance itself. Then a connection can be pulled, used, and returned only when your code needs to be executed instead of every time the cached function is called.
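That suggestion can be sketched as follows, with the pool and connection stubbed out (the types and methods are hypothetical stand-ins for a real pool such as r2d2 or deadpool): the connection is checked out only inside the miss path, so cache hits never touch the pool.

```rust
use std::collections::HashMap;

// Illustrative stand-ins for a real pool and connection.
struct Pool;
struct Conn;

impl Pool {
    fn get(&self) -> Conn {
        Conn // checkout happens here, only when called
    }
}

impl Conn {
    fn query(&self, foo: &str) -> String {
        format!("row:{foo}")
    }
}

// Takes the pool, not a connection: hits skip the pool entirely.
fn do_expensive_db_query(
    cache: &mut HashMap<String, String>,
    pool: &Pool,
    foo: &str,
) -> String {
    if let Some(hit) = cache.get(foo) {
        return hit.clone(); // no connection checked out at all
    }
    let conn = pool.get(); // checkout only on a cache miss
    let value = conn.query(foo);
    cache.insert(foo.to_string(), value.clone());
    value // conn dropped here, returned to the pool promptly
}

fn main() {
    let pool = Pool;
    let mut cache = HashMap::new();
    assert_eq!(do_expensive_db_query(&mut cache, &pool, "a"), "row:a");
    assert_eq!(do_expensive_db_query(&mut cache, &pool, "a"), "row:a");
    assert_eq!(cache.len(), 1);
}
```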

Thanks @jaemk ,

The issue is frustratingly non-deterministic. I suspect it may have something to do with lock contention.

Thank you for taking the time to explain all this to me, it's very helpful and deeply appreciated.

Let's close this issue for now, I'll open a new one if I manage to get to the bottom of it in a meaningful / reproducible way.

Hope you can figure it out! I'd definitely try passing in a connection pool if possible.