Usage of ArcSwapWeak

Question

KaiserKarel opened this issue 4 years ago · comments

I'm a bit unclear on the usage of ArcSwapWeak (and the docs might be a bit brief on the subject).

I've got shared configuration inside an ArcSwap, and a background routine which updates this configuration every X seconds. If no one holds a reference to the configuration, the background routine should end. Thus the background routine gets an ArcSwapWeak, and if upgrading the internal Weak returns None, the routine ends.

I am a bit unclear on how to store the updated values. Currently I am assigning every field to the old stored object, but I am unsure if that is correct. I think this is a very common use case, so I'd be willing to add an example to the docs with some guidance. :)

pub async fn initialized(cfg: Config) -> Result<Service, HttpError> {
    let set = JwkSet::from_url(&cfg.jwk_url).await?;
    let handle = Arc::new(set); // actual users of jwkset obtain this handle
    let weak = ArcSwapWeak::new(Arc::downgrade(&handle)); // meant for the background routine
    let url = cfg.jwk_url.clone();

    // this routine should remain running as long as the weak handle is being used, else
    // it should stop on the next iteration (no async destructors, else it could be awoken earlier)
    tokio::spawn(async move {
        loop {
            let jwks = &*weak.load();
            if let Some(jwks) = jwks.upgrade() {
                if let Some(duration) = jwks.valid_for() {
                    tokio::time::delay_for(duration).await;
                } else {
                    let new = JwkSet::from_url(&url).await.unwrap();
                    // store the new inside the handle somehow?
                }
            } else {
                // If the Weak no longer points to an actual Arc, the service has been dropped and
                // this routine can end.
                break;
            }
        }
    });
    Ok(Service::new(cfg, ArcSwap::new(handle)))
}

Since JwkSet can be quite large, I'd prefer not to use an ArcSwap, so that the actual data may be dropped ASAP.

My current implementation just uses an ArcSwap for the background routine, but I believe this means that the JwkSet remains allocated until the routine checks the count.

pub async fn initialized(cfg: Config) -> Result<Service, HttpError> {
    let set = Arc::new(JwkSet::from_url(&cfg.jwk_url).await?);

    let handle = ArcSwap::new(Arc::clone(&set));
    let url = cfg.jwk_url.clone();

    {
        let handle = handle.clone();
        tokio::spawn(async move {
            loop {
                let jwks = &*handle.load();
                // If the reference count has dropped to 1, the background task is the only current
                // reference, and thus there is no need to keep updating.
                if Arc::strong_count(jwks) == 1 {
                    break;
                }

                if let Some(duration) = jwks.valid_for() {
                    tokio::time::delay_for(duration).await;
                } else {
                    let set = JwkSet::from_url(&url).await.unwrap();
                    handle.swap(Arc::new(set));
                }
            }
        });
    }

    Ok(Service::new(cfg, handle))
}

Michal 'vorner' Vaner · Answer 1 · Fri Nov 06 2020 02:25:05 GMT+0800 (China Standard Time)

I believe your confusion stems from a deeper misunderstanding of how ArcSwap works. In both cases, you create two independent ArcSwaps, each initially pointing to the same thing, but diverging on the first swap. That both point to the same thing is a coincidence: they are not tied to each other, and an update of one won't propagate to the other.

And indeed, assigning to separate fields is not the way you should be doing it.

What you want to hand out to the rest of the application and to the background task are Arc<ArcSwap<Config>> and Weak<ArcSwap<Config>>, respectively. Then they share the same ArcSwap, so changing its value from one thread will propagate to the other.
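The ownership layout described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `RwLock<Arc<T>>` stands in for `ArcSwap<T>` (the analogy drawn further down in this thread) so the snippet runs without extra crates, a plain thread replaces the tokio task, and the names `Shared`, `spawn_refresher` and `run_demo` are made up for the example.

```rust
use std::sync::{Arc, RwLock, Weak};
use std::thread;
use std::time::Duration;

// Assumption: RwLock<Arc<T>> is a stand-in for ArcSwap<T>.
type Shared<T> = RwLock<Arc<T>>;

/// Hypothetical background task. It only holds a Weak to the shared cell and
/// returns the number of refreshes once every strong handle is gone.
fn spawn_refresher(weak: Weak<Shared<u64>>) -> thread::JoinHandle<u64> {
    thread::spawn(move || {
        let mut refreshes = 0;
        loop {
            match weak.upgrade() {
                Some(cell) => {
                    // "Refresh": build a new value and replace the stored Arc.
                    let next = **cell.read().unwrap() + 1;
                    *cell.write().unwrap() = Arc::new(next);
                    refreshes += 1;
                    // `cell` (the temporary strong handle) is dropped here,
                    // so the task does not keep the data alive while sleeping.
                }
                // All application handles were dropped: stop.
                None => return refreshes,
            }
            thread::sleep(Duration::from_millis(5));
        }
    })
}

/// Returns the last value the application saw and whether the task
/// performed at least that many refreshes before stopping.
fn run_demo() -> (u64, bool) {
    // The application side owns the strong handle to the shared cell.
    let handle: Arc<Shared<u64>> = Arc::new(RwLock::new(Arc::new(0)));
    let task = spawn_refresher(Arc::downgrade(&handle));

    // Updates made by the task are visible through the same cell.
    while **handle.read().unwrap() == 0 {
        thread::sleep(Duration::from_millis(1));
    }
    let seen = **handle.read().unwrap();

    // Dropping the last strong handle frees the stored value right away;
    // the task notices on its next iteration.
    drop(handle);
    let refreshes = task.join().unwrap();
    (seen, refreshes >= seen)
}

fn main() {
    let (seen, consistent) = run_demo();
    println!("last seen value: {seen}, consistent: {consistent}");
}
```

Because the task upgrades the Weak only for the duration of one update, the stored value is released as soon as the application drops its last handle, even if the task is still sleeping.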

As you're not the first one to get bitten by .clone() on ArcSwap, I'm starting to wonder if I should simply remove that trait implementation altogether.

https://docs.rs/arc-swap/1.0.0-rc1/arc_swap/docs/limitations/index.html#cloning-behaviour
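The divergence described in this answer can be reproduced in a small std-only sketch. It is an illustration, not arc-swap itself: `SwapCell` is a hypothetical stand-in built on `RwLock<Arc<T>>`, and "cloning" is modelled as building a second cell from the same inner `Arc`, which is the behaviour the answer attributes to `.clone()` on an ArcSwap.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for `ArcSwap<T>`: a cell whose `Arc<T>` can be replaced.
struct SwapCell<T>(RwLock<Arc<T>>);

impl<T> SwapCell<T> {
    fn new(value: Arc<T>) -> Self {
        SwapCell(RwLock::new(value))
    }

    /// Returns the currently stored Arc.
    fn load(&self) -> Arc<T> {
        Arc::clone(&self.0.read().unwrap())
    }

    /// Replaces the stored Arc.
    fn store(&self, value: Arc<T>) {
        *self.0.write().unwrap() = value;
    }
}

/// Two separate cells that start out pointing to the same Arc.
/// Returns what each cell holds after storing through `b` only.
fn independent_cells() -> (u32, u32) {
    let initial = Arc::new(1);
    let a = SwapCell::new(Arc::clone(&initial));
    let b = SwapCell::new(Arc::clone(&initial)); // models cloning the cell
    assert!(Arc::ptr_eq(&a.load(), &b.load())); // same value, by coincidence
    b.store(Arc::new(2));
    (*a.load(), *b.load())
}

/// One cell shared behind an outer Arc: both handles see the update.
fn shared_cell() -> (u32, u32) {
    let a = Arc::new(SwapCell::new(Arc::new(1)));
    let b = Arc::clone(&a); // clones the handle, not the cell
    b.store(Arc::new(2));
    (*a.load(), *b.load())
}

fn main() {
    println!("independent: {:?}", independent_cells());
    println!("shared:      {:?}", shared_cell());
}
```

In the first function the update stays local to `b`; in the second, the outer `Arc` is what ties the two handles to a single cell.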

Karel L. Kubat · Answer 2 · Fri Nov 06 2020 03:42:06 GMT+0800 (China Standard Time)

Ah, so basically ArcSwap<Arc<T>> == RwLock<T>, and Arc<ArcSwap> == Arc<RwLock>?

Michal 'vorner' Vaner · Answer 3 · Fri Nov 06 2020 03:48:39 GMT+0800 (China Standard Time)

More or less. It's more like ArcSwap<T> == RwLock<Arc<T>>, but that's the general idea.
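To make the `RwLock<Arc<T>>` analogy concrete, here is a minimal std-only sketch of the snapshot behaviour it implies: a reader clones the inner `Arc` out of the cell and keeps using that version even after a writer replaces it. The function name `snapshot_demo` is made up for the example; in arc-swap, `load_full` and `store` play roughly the roles of the read-and-clone and the write-and-assign below.

```rust
use std::sync::{Arc, RwLock};

/// Returns (old snapshot, current value, strong count of the old snapshot).
fn snapshot_demo() -> (String, String, usize) {
    let cell: RwLock<Arc<String>> = RwLock::new(Arc::new("v1".to_string()));

    // Reader: take a snapshot by cloning the Arc; the lock is released
    // at the end of this statement.
    let snapshot = Arc::clone(&cell.read().unwrap());

    // Writer: replace the whole Arc instead of mutating fields in place.
    *cell.write().unwrap() = Arc::new("v2".to_string());

    let current = Arc::clone(&cell.read().unwrap());

    // The cell no longer references the old value, so the snapshot is its
    // only owner and the old value is freed when the snapshot is dropped.
    let old_owners = Arc::strong_count(&snapshot);
    ((*snapshot).clone(), (*current).clone(), old_owners)
}

fn main() {
    let (old, new, owners) = snapshot_demo();
    println!("snapshot = {old}, current = {new}, owners of snapshot = {owners}");
}
```

This is also why updating means storing a fresh `Arc` rather than assigning individual fields of the old object.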

Michal 'vorner' Vaner · Answer 4 · Fri Nov 06 2020 17:39:21 GMT+0800 (China Standard Time)

I believe this can be closed (I'll consider removing the Clone impl, but I don't think there's any specific action in this issue)

Karel L. Kubat · Answer 5 · Fri Nov 06 2020 18:48:02 GMT+0800 (China Standard Time)

Yeah, this can be closed, it cleared up a lot for me. Perhaps it might be better to remove the Clone implementation, as it does not have the same semantics as Arc::clone, which might confuse users. What are the use cases for the Clone impl as it stands?

Michal 'vorner' Vaner · Answer 6 · Fri Nov 06 2020 20:11:09 GMT+0800 (China Standard Time)

Honestly, I'm just used to implementing Clone (and Debug and some more) whenever it's possible and there's no very good reason not to. Types that are not Clone are just a pain to work with.

But people being confused by it probably is a good reason not to do that, and considering RwLock doesn't implement it either, it kind of makes sense.