Looking for service level metrics (RunningTaskCount, PendingTaskCount, DesiredTaskCount)
silk-bahamut opened this issue · comments
Question
I'm looking to replace the standard Container Insights configuration in an ECS cluster with a custom one based on aws-otel-collector.
I've found many explanations of how to configure the metrics, but my problem is that I need service-level metrics.
Here are all the metrics that I found, and all of them seem to be at the task level.
In this list, my main interests are:
- RunningTaskCount
- PendingTaskCount
- DesiredTaskCount
Is there any way to access these metrics with the aws-otel-collector?
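For context, the ECS DescribeServices API reports `runningCount`, `pendingCount`, and `desiredCount` for each service. The following is only a minimal sketch of how those fields could be mapped to the three metric names, assuming that reading them outside the collector is acceptable. It is not a feature of aws-otel-collector; the function name `service_task_counts` and the sample response are hypothetical.

```python
from typing import Any


def service_task_counts(response: dict[str, Any]) -> dict[str, dict[str, int]]:
    """Map each service name to the three service-level task counts.

    `response` is assumed to have the shape of an ECS DescribeServices
    response, where each service carries runningCount, pendingCount and
    desiredCount.
    """
    counts: dict[str, dict[str, int]] = {}
    for service in response.get("services", []):
        counts[service["serviceName"]] = {
            "RunningTaskCount": service["runningCount"],
            "PendingTaskCount": service["pendingCount"],
            "DesiredTaskCount": service["desiredCount"],
        }
    return counts


# Hand-written sample in the shape of a DescribeServices response (not real data).
sample_response = {
    "services": [
        {
            "serviceName": "example-service",
            "desiredCount": 3,
            "runningCount": 2,
            "pendingCount": 1,
        }
    ]
}

print(service_task_counts(sample_response))
```

With boto3, a response of this shape would come from `boto3.client("ecs").describe_services(cluster=..., services=[...])`. Publishing the resulting values as metrics is left out of the sketch.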
Tests
I tried starting my cluster with the sidecar defined as below:
{
  "name": "aws-otel-collector",
  "image": "amazon/aws-otel-collector",
  "cpu": 0,
  "portMappings": [],
  "essential": true,
  "command": [
    "{{command}}"
  ],
  "environment": [],
  "mountPoints": [],
  "volumesFrom": [],
  "secrets": [
    {
      "name": "AOT_CONFIG_CONTENT",
      "valueFrom": "arn:aws:ssm:eu-central-1:123456789:parameter/bru/adot/config"
    }
  ],
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "loggroup-adot-test",
      "awslogs-region": "eu-central-1",
      "awslogs-stream-prefix": "metrics"
    }
  },
  "healthCheck": {
    "command": [
      "/healthcheck"
    ],
    "interval": 5,
    "timeout": 6,
    "retries": 5,
    "startPeriod": 1
  },
  "systemControls": []
},
I then saw many new metrics in CloudWatch:
{
  "view": "timeSeries",
  "stacked": false,
  "metrics": [
    [ "ECS/ContainerInsights", "container.cpu.cores" ],
    [ ".", "container.cpu.onlines" ],
    [ ".", "container.cpu.reserved" ],
    [ ".", "container.cpu.usage.vcpu" ],
    [ ".", "container.cpu.utilized" ],
    [ ".", "container.memory.reserved" ],
    [ ".", "container.memory.usage" ],
    [ ".", "container.memory.usage.limit" ],
    [ ".", "container.memory.usage.max" ],
    [ ".", "container.memory.utilized" ],
    [ ".", "container.network.rate.rx" ],
    [ ".", "container.network.rate.tx" ],
    [ ".", "ecs.task.cpu.cores" ],
    [ ".", "ecs.task.cpu.onlines" ],
    [ ".", "ecs.task.cpu.reserved" ],
    [ ".", "ecs.task.cpu.usage.vcpu" ],
    [ ".", "ecs.task.cpu.utilized" ],
    [ ".", "ecs.task.memory.reserved" ],
    [ ".", "ecs.task.memory.usage" ],
    [ ".", "ecs.task.memory.usage.limit" ],
    [ ".", "ecs.task.memory.usage.max" ],
    [ ".", "ecs.task.memory.utilized" ],
    [ ".", "ecs.task.network.rate.rx" ],
    [ ".", "ecs.task.network.rate.tx" ]
  ],
  "region": "eu-central-1"
}
This list seems to match the documentation, but it doesn't help me find out how to get the three metrics I'm missing.
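A quick way to confirm the gap is to compare the metric names in a widget definition against the three wanted ones. This is a minimal sketch; the helper `metric_names` is hypothetical, and the embedded widget is a shortened excerpt of the one shown in this issue.

```python
import json

WANTED = {"RunningTaskCount", "PendingTaskCount", "DesiredTaskCount"}


def metric_names(widget: dict) -> set[str]:
    """Return the metric names listed in a CloudWatch metric widget definition."""
    names = set()
    for row in widget["metrics"]:
        # Each row is [namespace, metric_name, ...]. In this widget only the
        # namespace position uses the "." shorthand (repeat the value from the
        # previous row), so the metric name is always the second element.
        names.add(row[1])
    return names


# Shortened excerpt of the widget definition from this issue.
widget = json.loads("""
{
  "view": "timeSeries",
  "stacked": false,
  "metrics": [
    [ "ECS/ContainerInsights", "container.cpu.utilized" ],
    [ ".", "ecs.task.cpu.utilized" ],
    [ ".", "ecs.task.memory.utilized" ]
  ],
  "region": "eu-central-1"
}
""")

missing = WANTED - metric_names(widget)
print(sorted(missing))  # ['DesiredTaskCount', 'PendingTaskCount', 'RunningTaskCount']
```

Running the same check against the full widget gives the same result, since every listed metric is prefixed with `container.` or `ecs.task.`.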
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.