operator-framework / java-operator-sdk

Java SDK for building Kubernetes Operators

Home Page: https://javaoperatorsdk.io/

How to handle CR and secret references set in a CR spec?

derlin opened this issue · comments

TL;DR: how can I support a one-to-many relationship between primary and secondary resources, in which the secondary resource doesn't know which primary resources depend on it?

Take the https://external-secrets.io operator as an example, which fetches secrets from a secret manager (e.g. HashiCorp Vault) and creates Secrets.

An ExternalSecret CR has a secretStoreRef in the spec, which references (with namespace and name) a (Cluster)SecretStore CR. The latter in turn has a reference (secretRef) to a Secret that holds the auth information to the secret manager.
When reconciling an ExternalSecret, we thus need to pull a SecretStore and a Secret. However, those two are not really managed, in the sense that they are not reconciled, just holding configuration information, and should never be created/updated/deleted by the operator.

I am trying to understand how to implement a similar mechanism with JOSDK. I tried using dependent resources, but the problem is, I cannot provide a SecondaryToPrimaryMapper as there is no way to know from a SecretStore which ExternalSecret may reference it. Same goes for Secret to SecretStore. Moreover, the SecretStore and Secret may be referenced by many ExternalSecret resources.
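One common way around the missing back-reference is to keep a reverse index on the primary side: each primary registers the key of the resource it references, and a change to the referenced resource is mapped back to its primaries by looking up that key. The following is a minimal, library-free sketch of that idea; `Ref` and `ReverseIndex` are hypothetical names for illustration and are not JOSDK types.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a namespaced resource reference (not a JOSDK type).
record Ref(String namespace, String name) {
  String key() {
    return namespace + "#" + name;
  }
}

// Reverse index: key of the referenced resource -> primaries that reference it.
final class ReverseIndex {
  private final Map<String, Set<Ref>> primariesByTarget = new HashMap<>();
  private final Map<Ref, String> targetByPrimary = new HashMap<>();

  // Call whenever a primary is created or its reference changes.
  void upsert(Ref primary, Ref target) {
    remove(primary); // drop any stale entry first
    targetByPrimary.put(primary, target.key());
    primariesByTarget.computeIfAbsent(target.key(), k -> new HashSet<>()).add(primary);
  }

  // Call when a primary is deleted.
  void remove(Ref primary) {
    String oldKey = targetByPrimary.remove(primary);
    if (oldKey != null) {
      Set<Ref> primaries = primariesByTarget.get(oldKey);
      primaries.remove(primary);
      if (primaries.isEmpty()) {
        primariesByTarget.remove(oldKey);
      }
    }
  }

  // The "secondary to primary" direction: who references this target?
  Set<Ref> primariesFor(Ref target) {
    return Set.copyOf(primariesByTarget.getOrDefault(target.key(), Set.of()));
  }
}
```

With such an index, one referenced resource can fan out to any number of primaries, which is the one-to-many relationship described above.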

Does the SDK provide utilities for this use case, or do I need to do the lookup manually using the Kubernetes client? How would you implement this?

Related question: still using the external-secrets example, the ExternalSecret reconciler needs to generate a secret (implemented as a CRUDKubernetesDependentResource) and also to check some things in the vault. I thus need to access the SecretStore (and its related Secret) in both the reconciler's reconcile method and the dependent secret's desired method. I cannot find a way to share this SecretStore between the two. Is there a way to avoid fetching the config twice? (I guess it is not a problem if the call is cached, but I struggle to understand what is cached and how. If I use the Kubernetes client directly, I guess there is no caching, right?)

Let me know if something is unclear.

Hi @derlin, that's an interesting use case!

Let me first make sure I understand your problem correctly: I guess you're writing an operator reconciling ExternalSecret resources, correct? And you want your reconciler to be triggered whenever a change occurs to your SecretStore and Secret secondary resources? How is the existing operator working? Is it reconciling ExternalSecret instances periodically to pick up changes to SecretStore instances?

@metacosm thank you for the quick reply :)

Let me be more precise with my actual use case, which is to provision external databases.

description

I have an EdbConfig (external database config) with information on how to connect to a database. To avoid having passwords in clear in the spec, I have a credentialsSecretRef that points to a secret with username and password.

I have one reconciler for EdbConfig, which just checks for the validity of the information, and updates the status (isOk, lastError). This would better be done in an admission webhook, but we don't have support yet ;)

This is for the config.

Now, I have an Edb CR, which actually triggers a database provisioning. This Edb has a reference to an EdbConfig, which tells in which cluster to provision a db. This reconciler basically connects to the db using the config information, creates a database/user/etc. and stores the information in a generated/dependent secret implemented as a dependent resource of type CRUDKubernetesDependentResource. This secret can now be used by e.g. a service.

what I have so far

So far, I have been able to watch secrets in the EdbConfig reconciler by using the primarytosecondary example: https://github.com/java-operator-sdk/java-operator-sdk/blob/main/operator-framework/src/test/java/io/javaoperatorsdk/operator/sample/primarytosecondary/JobReconciler.java. It works: I can get the associated secret, and a reconciliation is triggered when the credentials secret changes.

I am now stuck on the ExtDb reconciler. I want to be able to get the EdbConfig (and its associated secret) in both the reconciler's reconcile method, and in the desired and match methods of the dependent secret.

I thought of using the same approach in the reconciler, but how do I get both the EdbConfig and its secret? And how about the dependent secret?

Note: triggering a reconciliation of the Edb when the referenced EdbConfig changes is a stretch objective. The minimum requirement is to be able to fetch the config efficiently from the Edb reconciler and from its dependent secret.

I have one reconciler for EdbConfig, which just checks for the validity of the information, and updates the status (isOk, lastError). This would better be done in an admission webhook, but we don't have support yet ;)

Have you taken a look at https://github.com/java-operator-sdk/admission-controller-framework? ;)

I thought of using the same approach in the reconciler, but how do I get both the EdbConfig and its secret? And how about the dependent secret?

Is your code accessible somewhere as it might be easier to help you if we can look directly at the code? At the very least, can you share how you configure each reconciler in code? The details are not important but being able to see how each reconciler is configured in particular with respect to how they relate to their dependent resources would be really helpful…

Unfortunately I can't make the code public, but here is an overview.

This is the ExtDbConfig reconciler, which uses the EventSourceInitializer to watch the credentials secret:

public class ExtDbConfigReconciler implements Reconciler<ExtDbConfig>, ErrorStatusHandler<ExtDbConfig>,
    EventSourceInitializer<ExtDbConfig> {

  private static final String CONFIG_CREDENTIALS_INDEX = "config-secret-index";

  @Override
  public UpdateControl<ExtDbConfig> reconcile(ExtDbConfig resource, Context<ExtDbConfig> context) {
    // ...
    // With a typed Context, getSecondaryResource already returns Optional<Secret>: no cast needed
    var secret = context.getSecondaryResource(Secret.class).orElseThrow(() ->
        new ValidationException("Credentials secret not found: " + resource.getSpec().getCredentialsSecretRef()));
    // ...
  }

  @Override
  public Map<String, EventSource> prepareEventSources(EventSourceContext<ExtDbConfig> context) {

    context.getPrimaryCache().addIndexer(CONFIG_CREDENTIALS_INDEX, this::indexKey);

    InformerConfiguration<Secret> informerConfiguration =
        InformerConfiguration.from(Secret.class, context)
            .withSecondaryToPrimaryMapper(secret -> context.getPrimaryCache()
                .byIndex(CONFIG_CREDENTIALS_INDEX, indexKey(secret))
                .stream().map(ResourceID::fromResource).collect(Collectors.toSet()))
            .withPrimaryToSecondaryMapper(this::getSecretResourceId)
            .withNamespacesInheritedFromController(context)
            .build();

    return EventSourceInitializer
        .nameEventSources(new InformerEventSource<>(informerConfiguration, context));
  }

  private List<String> indexKey(ExtDbConfig extDbConfig) {
    var secretRef = extDbConfig.getSpec().getCredentialsSecretRef();
    return List.of(secretRef.getNamespace() + "#" + secretRef.getName());
  }

  private String indexKey(Secret secret) {
    return secret.getMetadata().getNamespace() + "#" + secret.getMetadata().getName();
  }

  private Set<ResourceID> getSecretResourceId(ExtDbConfig extDbConfig) {
    var secretRef = extDbConfig.getSpec().getCredentialsSecretRef();
    return Set.of(new ResourceID(secretRef.getName(), secretRef.getNamespace()));
  }
}

As for the ExtDb reconciler, I have a dependent secret:

@KubernetesDependent(labelSelector = LABEL_NAME + "=" + LABEL_VALUE)
@Slf4j
@ApplicationScoped
public class DependentSecret extends CRUDKubernetesDependentResource<Secret, ExtDb> implements SecondaryToPrimaryMapper<Secret> {

  public static final String LABEL_NAME = "...";
  public static final String LABEL_VALUE = "...";
  public static final String SECRET_SUFFIX = "-db";

  public DependentSecret() {
    super(Secret.class);
  }

  @Override
  protected Secret desired(ExtDb extDb, Context<ExtDb> context) {
    // 🛑 here I need access to the ExtDbConfig information !
  }

  @Override
  public Result<Secret> match(Secret actualResource, ExtDb extDb, Context<ExtDb> context) {
    // 🛑 here I need access to the ExtDbConfig information !
  }

  @Override
  public Set<ResourceID> toPrimaryResourceIDs(Secret dependentResource) {
    var name = dependentResource.getMetadata().getName();
    // endsWith, not contains: the suffix is stripped from the end of the name below
    if (name.endsWith(SECRET_SUFFIX)) {
      return Set.of(new ResourceID(name.substring(0, name.length() - SECRET_SUFFIX.length()),
          dependentResource.getMetadata().getNamespace()));
    }
    return Set.of();
  }
}
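The name-based mapping used in toPrimaryResourceIDs can be checked in isolation. The sketch below is library-free and uses a hypothetical helper class `SecretNames`; it assumes the dependent secret is always named `<primary name>` followed by the suffix. Note that a `contains` check would also accept a name such as `my-db-cache` and then strip the wrong characters, so the sketch tests the end of the name instead.

```java
import java.util.Optional;

// Hypothetical helper, assuming secrets are named "<primary name>" + SECRET_SUFFIX.
final class SecretNames {
  static final String SECRET_SUFFIX = "-db";

  private SecretNames() {}

  // Primary name -> name of its generated secret.
  static String secretNameFor(String primaryName) {
    return primaryName + SECRET_SUFFIX;
  }

  // Secret name -> primary name, or empty if the secret does not follow the convention.
  static Optional<String> primaryNameFor(String secretName) {
    boolean followsConvention =
        secretName.endsWith(SECRET_SUFFIX) && secretName.length() > SECRET_SUFFIX.length();
    if (!followsConvention) {
      return Optional.empty();
    }
    return Optional.of(secretName.substring(0, secretName.length() - SECRET_SUFFIX.length()));
  }
}
```

Returning an empty result for non-matching names mirrors the `Set.of()` fallback in the code above: secrets that do not follow the convention trigger no reconciliation.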

And the reconciler:

@ControllerConfiguration(
    dependents = @Dependent(type = DependentSecret.class)
)
public class ExtDbReconciler implements Reconciler<ExtDb> {
  @Override
  public UpdateControl<ExtDb> reconcile(ExtDb extDb, Context<ExtDb> context) {
    // ...
    var secret = context.getSecondaryResource(Secret.class).orElseThrow();
    // 🛑 here I need access to the ExtDbConfig information ! 
    // ...
  }
}

Thank you, will take a closer look but it seems at first glance (and that's what I was thinking reading your description) that your problem might be solved by adding an ExtDbConfig dependent to your ExtDbReconciler and possibly use the workflow feature that we introduced in 3.1 to prevent your ExtDb from being reconciled if the ExtDbConfig is not valid. If you do that, I think you should be able to access the ExtDbConfig secondary resource from the context object that is being passed to your SecretDependent. You could also probably get rid of the ExtDbConfigReconciler altogether.

Wow, awesome!
And what about the credentials secret, how can I access it efficiently (caching would be great)?
Can you give me some pointers on how to implement this workflow exactly?

The workflow feature is described here: https://javaoperatorsdk.io/docs/workflows
Let me know if things are unclear so that we can improve the doc if needed… 😄
You can also take a look at https://github.com/java-operator-sdk/java-operator-sdk/tree/main/operator-framework/src/test/java/io/javaoperatorsdk/operator/sample/workflowallfeature but it might be a little complex to understand what's going on just from that 😓
The Quarkus extension also has a sample that uses the workflow feature:
https://github.com/quarkiverse/quarkus-operator-sdk/blob/main/samples/exposedapp/src/main/java/io/halkyon/ExposedAppReconciler.java#L21-L26
In this particular case, the Ingress dependent won't be reconciled until the Service is ready and the ExposedApp reconciler won't be triggered until the Ingress is ready either…
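To make the gating behaviour concrete, here is a toy, library-free model of "a dependent is only reconciled once everything it depends on is ready". It is a sketch of the concept only, not how JOSDK implements workflows; `MiniWorkflow` and its methods are hypothetical names, and it assumes nodes are registered in dependency order.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Toy model of dependency gating between dependents (hypothetical, not the JOSDK API).
final class MiniWorkflow {
  private record Node(List<String> dependsOn, BooleanSupplier ready, Runnable reconcile) {}

  private final Map<String, Node> nodes = new LinkedHashMap<>();

  // Assumption: nodes are added after the nodes they depend on.
  MiniWorkflow add(String name, List<String> dependsOn, BooleanSupplier ready, Runnable reconcile) {
    nodes.put(name, new Node(dependsOn, ready, reconcile));
    return this;
  }

  // One pass: reconcile every node whose dependencies are all ready.
  // Returns the names of the nodes that were reconciled, in order.
  List<String> run() {
    List<String> reconciled = new ArrayList<>();
    Set<String> readyNodes = new HashSet<>();
    for (Map.Entry<String, Node> entry : nodes.entrySet()) {
      Node node = entry.getValue();
      if (readyNodes.containsAll(node.dependsOn())) {
        node.reconcile().run();
        reconciled.add(entry.getKey());
        if (node.ready().getAsBoolean()) {
          readyNodes.add(entry.getKey());
        }
      }
    }
    return reconciled;
  }
}
```

In the Service and Ingress example above, a first pass would reconcile only the Service; the Ingress is reconciled on a later pass, once the Service reports ready.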

Thank you for the pointers, I will have a look and get back to you !

Just one detail: the config will become a dependent resource with a condition, this I get. But what about the credentials secret? This one is referenced by the config, and dependent resources cannot have other dependent resources, right? So how do you handle the credentials secret?

And how do I avoid the "Provide a SecondaryToPrimaryMapper to associate this resource with the primary resource" error when dealing with dependent resources? (I feel I am back to square one ^^)

Well, a dependent can depend on other dependents via the dependsOn relation that you can add to your dependent configuration… Of course, it's not the same as having multiple levels but you can somewhat emulate this organisation that way. I need to look at your code closer and play with it, it might end up that what you're trying to achieve isn't currently feasible with dependents… I don't know for sure at this point.

I am already having trouble, and this is actually what triggered my issue. When using a dependent resource, I always get this error:

Provide a SecondaryToPrimaryMapper to associate this resource with the primary resource

Which I cannot provide.

Is there any possibility to simply share the cache between the config reconciler and the db reconciler, without using dependent resources?

Caches are currently tied to event sources (this is actually something we're thinking about changing but we need to figure out how to design it cleanly) so there's a cache per resource type. Maybe we could expose them via the Context object?
Actually, what would your ideal solution to your problem look like, code-wise?

Tough question, I would need to think about that. But for now, having a shared cache would be enough. Thinking of it, I could implement my own cache, as I have a reconciler for the config that gets called whenever the config or associated secret changes. Furthermore, there would be only a handful: 5 max.

The only thing is, the config reconciler needs to run prior to the extdb reconciler. I am not sure how to handle the case where an extdb is reconciled before its config has been processed. I could simply reschedule the reconciliation for later, hoping a small delay would be enough, or actually query the Kubernetes API (it shouldn't happen often). But what if the config ref is wrong?
What do you think?

Another avenue to explore is adding an indexer to the cache, as done in https://github.com/java-operator-sdk/java-operator-sdk/blob/main/operator-framework/src/test/java/io/javaoperatorsdk/operator/sample/primaryindexer/DependentPrimaryIndexerTestReconciler.java. I'll try to dig deeper into your use case tomorrow.

Hello @metacosm ,
Just to keep you updated before the weekend. I tried the DependentPrimaryIndexer example:

@Slf4j
@ControllerConfiguration(
    dependents = @Dependent(name = "config", type = ReadOnlyConfigDependent.class)
)
public class ExtDbReconciler implements Reconciler<ExtDb>, ErrorStatusHandler<ExtDb> {

  public static final String CONFIG_RELATION_INDEXER = "config-indexer";

  @Override
  public UpdateControl<ExtDb> reconcile(ExtDb resource, Context<ExtDb> context) {
    // Typed Context: getSecondaryResource returns Optional<ExtDbConfig>, so no casts are needed
    var config = context.getSecondaryResource(ExtDbConfig.class, "config")
        .orElseThrow(() -> new RuntimeException("Configuration doesn't exist"));
    // ...
  }

  protected static final Function<ExtDb, List<String>> indexer =
      resource -> List.of(resource.getSpec().getConfigRef().getNamespace() + "#" + resource.getSpec().getConfigRef().getName());

  protected static final Function<ExtDbConfig, String> indexerConfig =
      resource -> resource.getMetadata().getNamespace() + "#" + resource.getMetadata().getName();

  public static class ReadOnlyConfigDependent extends KubernetesDependentResource<ExtDbConfig, ExtDb>
      implements SecondaryToPrimaryMapper<ExtDbConfig> {
    private IndexerResourceCache<ExtDb> cache;

    public ReadOnlyConfigDependent() {
      super(ExtDbConfig.class);
    }

    @Override
    public Set<ResourceID> toPrimaryResourceIDs(ExtDbConfig dependentResource) {
      return cache.byIndex(CONFIG_RELATION_INDEXER, indexerConfig.apply(dependentResource))
          .stream()
          .map(ResourceID::fromResource)
          .collect(Collectors.toSet());
    }

    @Override
    public EventSource initEventSource(
        EventSourceContext<ExtDb> context) {
      cache = context.getPrimaryCache();
      cache.addIndexer(CONFIG_RELATION_INDEXER, indexer);
      return super.initEventSource(context);
    }
  }
}

It doesn't work. Nearly every time, I get a "Configuration doesn't exist" exception over and over (the first resource installed usually works, then all the others fail). If I restart the operator with the resources already present, though, it works.


I also tried another take: I cache my config in the ExtDbConfigReconciler, and then I use the cache in my ExtDbReconciler. For this to work, I need to manage the dependent secret myself (i.e. I need to call reconcile manually once I know the config exists). The code for the db reconciler thus looks like:

@ControllerConfiguration
public class ExtDbReconciler implements Reconciler<ExtDb>,
    ErrorStatusHandler<ExtDb>,
    EventSourceInitializer<ExtDb> {

  final ConfigCacher configCacher;
  final DependentSecret dependentSecret;
  // ...
  final KubernetesClient kubernetesClient;

  @Override
  public UpdateControl<ExtDb> reconcile(ExtDb extDb, Context<ExtDb> context) throws Exception {

    try {
      var config = configCacher.getOrThrow(extDb.getSpec().getConfigRef());
      dependentSecret.reconcile(extDb, context);
      var secret = context.getSecondaryResource(Secret.class).orElseThrow();
      // ...
    } catch (ConfigException e) {
      log.warn("{}. Rescheduling.", e.getMessage());
      return UpdateControl.<ExtDb>noUpdate().rescheduleAfter(500L);
    }
  }

  @Override
  public Map<String, EventSource> prepareEventSources(EventSourceContext<ExtDb> context) {
    dependentSecret.setKubernetesClient(kubernetesClient); // if not set, a NullPointerException is thrown during reconcile
    return EventSourceInitializer.nameEventSources(dependentSecret.initEventSource(context));
  }
}

Note that in the DependentSecret code, I can now call cache.getConfig since the config has been checked by the reconciler first.

This works well, except during startup: the config reconciler is slower than the db one, and thus the cache is empty at first. I don't know how to handle this. Also, in this version, the db reconciler doesn't get called if the config changes.

Let me know what you found out on your side !

So I've been looking into this a little bit more… I will try to replicate the issue locally. If you could extract a minimal project that exhibits the issue, that would help as well!

In the meantime, am I right in thinking that the whole process is triggered when a user creates an ExtDbConfig resource? Or are ExtDb and ExtDbConfig completely independent?

The configs are independent in the sense that they will be created separately, mostly by the admins.
The trigger is the ExtDb, which references a config.
The only important reconciler is thus the one for the ExtDb, which reads the config (and its secret) and uses it to create a db and a secret.
I still haven't found an elegant way. I will create a simple project tomorrow (it's late where I am) and share it with you on github, so we can put our brains on something concrete!

Here is an example: https://github.com/derlin/josdk-operator-dependent-crs
On the main branch, I am using a ConfigCacher. As you will see, there are lots of edge cases and problems:

  • on startup, the db CRs are reconciled first, and need to be rescheduled as the config is not in the cache (thus, when the config is missing, it can be either because it really doesn't exist or because it hasn't been processed yet. So how do you tell the two apart?)
  • the db CRs are not reconciled on config change
  • multiple edge cases as the config cache and the upstream are not perfectly in sync

etc. So overall, I would rather do it differently, but no idea how.
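One possible way to tell "the config does not exist" apart from "the config has not been processed yet" is to have the cache record whether its initial population is complete. The sketch below is library-free; `SyncAwareCache` is a hypothetical class, and it assumes something (for example the config reconciler or a startup listing) calls `markSynced()` once every pre-existing config has been stored.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache that distinguishes "not there" from "not loaded yet".
final class SyncAwareCache<V> {
  enum State { FOUND, MISSING, NOT_SYNCED }

  record Lookup<T>(State state, Optional<T> value) {}

  private final Map<String, V> entries = new ConcurrentHashMap<>();
  private volatile boolean synced = false;

  void put(String key, V value) {
    entries.put(key, value);
  }

  void remove(String key) {
    entries.remove(key);
  }

  // Assumption: called once all pre-existing entries have been put.
  void markSynced() {
    synced = true;
  }

  Lookup<V> lookup(String key) {
    // Read the flag first: if it is already true, every initial put is visible to the get below.
    boolean wasSynced = synced;
    V value = entries.get(key);
    if (value != null) {
      return new Lookup<>(State.FOUND, Optional.of(value));
    }
    return new Lookup<>(wasSynced ? State.MISSING : State.NOT_SYNCED, Optional.empty());
  }
}
```

A reconciler could then reschedule on NOT_SYNCED and report a status error on MISSING. This does not address the other two bullets: the db CRs are still not reconciled when a config changes, and the cache can still lag behind the cluster.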

EDIT: I added the branch primary-indexer which is based on the test of the same name (https://github.com/java-operator-sdk/java-operator-sdk/blob/main/operator-framework/src/test/java/io/javaoperatorsdk/operator/sample/primaryindexer/DependentPrimaryIndexerTestReconciler.java): https://github.com/derlin/josdk-operator-dependent-crs/tree/primary-indexer

You will see that it works when the operator starts and the config/db are already on the cluster, but if I try to create a db at runtime, the config will always be null. Is this a bug in the framework or did I miss something? (Also note that on this branch, I haven't even tried to fetch the credentials secret yet, just the config.)

@derlin thank you! I will take a look at it next week.

Great. In the meantime, I found a pseudo-solution by changing the spec. If, instead of having a reference to the secret in the config, I use the rule "the credentials secret must have the same name and be in the same namespace as the config", I can get both the config and the secret from the ExtDb reconciler. A basic implementation can be found on the branch primary-to-secondary: https://github.com/derlin/josdk-operator-dependent-crs/tree/primary-to-secondary.

However, I also had to add a label rule to the secret, to avoid watching all secrets from all namespaces (over the top just for one or two secrets that will not change often).
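The convention and the label rule can be sketched without any library code. In the sketch below, `ResourceKey`, `ConventionMapper`, and the label name and value are all hypothetical; the only assumptions taken from the description above are that the credentials secret shares the config's name and namespace, and that watched secrets carry a marker label.

```java
import java.util.Map;

// Hypothetical identifier for a resource: kind plus namespace and name.
record ResourceKey(String kind, String namespace, String name) {}

// Sketch of the "same name, same namespace" convention with a label filter.
final class ConventionMapper {
  // Hypothetical label; the real name and value are not given in the thread.
  static final String LABEL_NAME = "example.org/extdb-credentials";
  static final String LABEL_VALUE = "true";

  private ConventionMapper() {}

  // The config referenced by an ExtDb's configRef (namespace + name).
  static ResourceKey configFor(String configNamespace, String configName) {
    return new ResourceKey("ExtDbConfig", configNamespace, configName);
  }

  // By convention, the credentials secret has the config's namespace and name.
  static ResourceKey credentialsFor(ResourceKey config) {
    return new ResourceKey("Secret", config.namespace(), config.name());
  }

  // Only secrets carrying the marker label are worth watching.
  static boolean isWatched(Map<String, String> labels) {
    return labels != null && LABEL_VALUE.equals(labels.get(LABEL_NAME));
  }
}
```

Because both identifiers are derived from the primary alone, no reverse lookup from the secret to the config is needed, at the cost of a less flexible spec.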

Let me know if you see another, better way to come around this problem.

The "dependent resources are named after the primary resource" trick strikes again… 😄
That is indeed a very common and relatively easy way to deal with dependents, that said, that probably doesn't address the issue of getting the dependents from the caches directly… I'll take a look starting Monday.

Well, given that I can guess the resource ID of both from the primary using this trick, I am able to initialize event sources (for both) in the primary reconciler (and thus use the cache). I will let you have a look at the code, let's discuss then :)

Started looking at your code in greater detail, this is more or less what I understood so it's good :)
One question I have, though: why is the secret not created by the operator?
Somewhat distantly related: have you ever looked at Service Binding?

We are talking about the credentials secret, which cannot be created by the operator since the operator needs it to get the root username and password of the external db cluster (it does create secrets for applications, though). This secret is an input to the operator, not a byproduct.

As for service binding, I read briefly about it when it was announced. From what I understood, it could only be useful to configure the apps, as outputs of the operator (and even then, I am not sure it would add much compared to a secret, especially since it would require a change in our whole codebase).
But for configuring the operator itself, as service binding will mount secrets in the operator pod it would mean they need to exist prior to the operator start. In my use case, the config CRs (and credentials secret) are meant to be provided, changed, etc at runtime, and the operator needs to react properly to these changes without a restart. Does it make sense?

Makes sense. Was thinking more about a db cluster provisioning scenario where the whole thing was handled by the operator, including setting up the DB instance itself.

Yes. Was just curious if you knew about the spec and how it related to your use case.

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

Can we really consider it closed?

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

This issue was closed because it has been stalled for 14 days with no activity.