spire-agent health check reports spire_agent_rpc_workload_api_fetch_x509_bundles{status="PermissionDenied"} metrics
penghuazhou opened this issue · comments
The spire-agent health check invokes this method: sendX509BundlesResponse. Because there is no SPIRE registration entry that selects the spire-agent pod itself, the call returns codes.PermissionDenied, and the agent reports this metric: spire_agent_rpc_workload_api_fetch_x509_bundles{status="PermissionDenied"}. When the same error occurs for a normal request that is not a health check, I cannot distinguish between the two.
func sendX509BundlesResponse(update *cache.WorkloadUpdate, stream workload.SpiffeWorkloadAPI_FetchX509BundlesServer, log logrus.FieldLogger, allowUnauthenticatedVerifiers bool, previousResponse *workload.X509BundlesResponse, quietLogging bool) (*workload.X509BundlesResponse, error) {
	if !allowUnauthenticatedVerifiers && !update.HasIdentity() {
		if !quietLogging {
			log.WithField(telemetry.Registered, false).Error("No identity issued")
		}
		return nil, status.Error(codes.PermissionDenied, "no identity issued")
	}
	resp, err := composeX509BundlesResponse(update)
	if err != nil {
		log.WithError(err).Error("Could not serialize X509 bundle response")
		return nil, status.Errorf(codes.Unavailable, "could not serialize response: %v", err)
	}
	if proto.Equal(resp, previousResponse) {
		return previousResponse, nil
	}
- Version: v1.9.6
- Platform: linux-amd64
- Subsystem: spire-agent
Thank you for opening this issue, @penghuazhou.
I think that the health check shouldn't interfere with the metrics. We already filter the logs when the request comes from the agent's own PID, and I think that we should do the same for the metrics.
I believe that we will need to handle this at the middleware level, so that no metrics are emitted for requests made by the agent itself.
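To make the middleware idea concrete, here is a minimal sketch using only the Go standard library. It is not SPIRE's actual middleware or telemetry API: callCounter, handler, withMetrics, and denyAll are hypothetical names, and the caller PID is assumed to be already known to the wrapper. The wrapper always runs the handler but skips the counter when the caller is the agent's own process.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// callCounter is a hypothetical stand-in for the agent's metrics sink.
type callCounter struct {
	counts map[string]int
}

// inc records one call under a key shaped like method{status="..."}.
func (c *callCounter) inc(method, status string) {
	c.counts[method+`{status="`+status+`"}`]++
}

// errPermissionDenied stands in for a gRPC PermissionDenied status error.
var errPermissionDenied = errors.New("no identity issued")

// handler is a hypothetical RPC handler that receives the caller's PID.
type handler func(callerPID int) error

// withMetrics wraps next so that a call is counted only when it does not
// originate from selfPID (assumed to be the agent's own process ID).
func withMetrics(c *callCounter, method string, selfPID int, next handler) handler {
	return func(callerPID int) error {
		err := next(callerPID)
		if callerPID == selfPID {
			// Request from the agent itself (e.g. the health check):
			// return the result without emitting a metric.
			return err
		}
		status := "OK"
		switch {
		case errors.Is(err, errPermissionDenied):
			status = "PermissionDenied"
		case err != nil:
			status = "Unknown"
		}
		c.inc(method, status)
		return err
	}
}

// denyAll mimics a caller that matches no registration entry.
func denyAll(callerPID int) error {
	return errPermissionDenied
}

func main() {
	c := &callCounter{counts: map[string]int{}}
	self := os.Getpid()
	h := withMetrics(c, "fetch_x509_bundles", self, denyAll)

	_ = h(self)     // health check from the agent: not counted
	_ = h(self + 1) // some other workload: counted as PermissionDenied

	fmt.Println(c.counts)
}
```

With this shape, a PermissionDenied count would only reflect real workloads, which is what the original report needs in order to alert on it. Whether to drop self-originated calls entirely or record them under a separate label is a design choice left open here.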