ECS task role set on ECS agent startup command isn't applied to flow's run configuration

I’m deploying on AWS ECS and my ECS Agent is running the following command:

                "prefect",
                "agent",
                "ecs",
                "start",
                "--execution-role-arn",
                "arn:aws:iam::XXXX:role/prefect-test-service-role",
                "--task-role-arn",
                "arn:aws:iam::XXXX:role/prefect-test-task-role",
                "--cluster",
                "prefect-test"

However, when flow runs are launched in ECS, they do not seem to have the task role associated with them (causing permission issues).

I also tried setting the role on the flow directly with the task_role_arn parameter of ECSRun, but that didn’t change anything either.

(In either case the execution role gets applied to the flow correctly)

How can I debug why the task role isn’t being applied by the agent when launching a flow?

Setting the task role on the flow's ECSRun run config is the recommended approach, since each of your flows may need a different set of permissions. Could you share your full flow definition and how you register it?

Are you on Prefect Cloud or Server? Can you share the output of prefect diagnostics?

I could point you to resources about ECS:

If this didn’t help, LMK and please share as much info about your setup as possible

Thanks, I’ll look at those links. I think since the task role is defined in ECSRun it should be put through the task definition, but doesn’t seem to be. Here is my setup:

Prefect Diagnostics Output

{
  "config_overrides": {},
  "env_vars": [],
  "system_information": {
    "platform": "macOS-12.2.1-x86_64-i386-64bit",
    "prefect_backend": "cloud",
    "prefect_version": "1.1.0",
    "python_version": "3.10.2"
  }
}

Flow config

from prefect import Flow
from prefect.run_configs import ECSRun
from prefect.storage import GitHub

STORAGE_CONFIG = GitHub(
    repo="XXXX/workflows",  # name of repo
    path="workflows/marketing/test/test/flow.py",  # location of flow file in repo
    access_token_secret="personal access token",  # name of the Prefect Secret holding the token
)

RUN_CONFIG = ECSRun(
    labels=["test"],
    image="XXXX.dkr.ecr.us-west-2.amazonaws.com/workflows/test:latest",
    task_role_arn="arn:aws:iam::XXXX:role/prefect-test-task-role",
)

with Flow("Test", storage=STORAGE_CONFIG, run_config=RUN_CONFIG) as flow:
    ...

if __name__ == "__main__":
    flow.register(project_name="marketing")

Flow Registration

I had issues registering, but after following this issue I was able to get it to work.

python -m workflows.marketing.test.test.flow

ECS Agent task definition

This is the ECS task definition for the container running the agent.

{
    "taskDefinitionArn": "arn:aws:ecs:us-west-2:XXXX:task-definition/prefect-test:4",
    "containerDefinitions": [
        {
            "name": "prefect",
            "image": "prefecthq/prefect:latest-python3.8",
            "cpu": 0,
            "links": [],
            "portMappings": [],
            "essential": true,
            "entryPoint": [],
            "command": [
                "prefect",
                "agent",
                "ecs",
                "start",
                "--execution-role-arn",
                "arn:aws:iam::XXXX:role/prefect-test-service-role",
                "--task-role-arn",
                "arn:aws:iam::XXXX:role/prefect-test-task-role",
                "--cluster",
                "prefect-test"
            ],
            "environment": [
                {
                    "name": "PREFECT__CLOUD__AGENT__LABELS",
                    "value": "['test']"
                },
                {
                    "name": "PREFECT__CLOUD__API",
                    "value": "https://api.prefect.io"
                },
                {
                    "name": "PREFECT__CLOUD__AGENT__LEVEL",
                    "value": "INFO"
                }
            ],
            "environmentFiles": [],
            "mountPoints": [],
            "volumesFrom": [],
            "secrets": [
                {
                    "name": "PREFECT__CLOUD__API_KEY",
                    "valueFrom": "arn:aws:ssm:us-west-2:XXXX:parameter/prefect/ecs_agent/api_key"
                }
            ],
            "dnsServers": [],
            "dnsSearchDomains": [],
            "extraHosts": [],
            "dockerSecurityOptions": [],
            "dockerLabels": {},
            "ulimits": [],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "prefect-test",
                    "awslogs-region": "us-west-2",
                    "awslogs-stream-prefix": "prefect-test"
                },
                "secretOptions": []
            },
            "systemControls": []
        }
    ],
    "family": "prefect-test",
    "taskRoleArn": "arn:aws:iam::XXXX:role/prefect-test-task-role",
    "executionRoleArn": "arn:aws:iam::XXXX:role/prefect-test-service-role",
    "networkMode": "awsvpc",
    "revision": 4,
    "volumes": [],
    "status": "ACTIVE",
    "requiresAttributes": [
        {
            "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
        },
        {
            "name": "ecs.capability.execution-role-awslogs"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
        },
        {
            "name": "com.amazonaws.ecs.capability.task-iam-role"
        },
        {
            "name": "ecs.capability.secrets.ssm.environment-variables"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
        },
        {
            "name": "ecs.capability.task-eni"
        }
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2",
        "FARGATE"
    ],
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "512",
    "memory": "1024",
    "registeredAt": "2022-04-19T02:06:54.479Z",
    "registeredBy": "XXXX",
    "tags": []
}

Flow task definition

When a new flow run is submitted, the following task definition is created automatically. It contains the execution role but not the task role.

{
    "taskDefinitionArn": "arn:aws:ecs:us-west-2:XXXX:task-definition/prefect-test-dde22515-bf71-4f9a-bf6e-b4bbe23cf06d:1",
    "containerDefinitions": [
        {
            "name": "flow",
            "image": "XXXX.dkr.ecr.us-west-2.amazonaws.com/workflows/test:latest",
            "cpu": 0,
            "portMappings": [],
            "essential": true,
            "environment": [
                {
                    "name": "PREFECT__CONTEXT__IMAGE",
                    "value": "XXXX.dkr.ecr.us-west-2.amazonaws.com/workflows/test:latest"
                }
            ],
            "mountPoints": [],
            "volumesFrom": []
        }
    ],
    "family": "prefect-test-dde22515-bf71-4f9a-bf6e-b4bbe23cf06d",
    "executionRoleArn": "arn:aws:iam::XXXX:role/prefect-test-service-role",
    "networkMode": "awsvpc",
    "revision": 1,
    "volumes": [],
    "status": "INACTIVE",
    "requiresAttributes": [
        {
            "name": "com.amazonaws.ecs.capability.ecr-auth"
        },
        {
            "name": "ecs.capability.execution-role-ecr-pull"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
        },
        {
            "name": "ecs.capability.task-eni"
        }
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2",
        "FARGATE"
    ],
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "1024",
    "memory": "2048",
    "registeredAt": "2022-04-19T23:22:00.584Z",
    "deregisteredAt": "2022-04-19T23:22:01.651Z",
    "registeredBy": "arn:aws:sts::XXXX:assumed-role/prefect-test-task-role/46032a0d17944ebf94c5003385879843",
    "tags": []
}
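
One way to narrow this down is to compare the role on the registered task definition with the role on the running task itself, because ECS accepts a task role both in the task definition (`taskRoleArn`) and as a per-run override passed to `RunTask` (`overrides.taskRoleArn`). The sketch below is only an illustration, not part of the original setup: the helper name `effective_task_role` is made up, and it operates on plain dictionaries shaped like the responses of boto3's `describe_task_definition` and `describe_tasks` calls.

```python
from typing import Optional


def effective_task_role(task_definition: dict, task: Optional[dict] = None) -> Optional[str]:
    """Return the task role ARN the containers would assume, or None.

    A role passed as a RunTask override wins over the one stored
    in the task definition.
    """
    if task is not None:
        override = task.get("overrides", {}).get("taskRoleArn")
        if override:
            return override
    return task_definition.get("taskRoleArn")


# Trimmed copy of the flow task definition shown above (no taskRoleArn key).
flow_task_definition = {
    "family": "prefect-test-dde22515-bf71-4f9a-bf6e-b4bbe23cf06d",
    "executionRoleArn": "arn:aws:iam::XXXX:role/prefect-test-service-role",
}

# With only the task definition available, no task role is found.
print(effective_task_role(flow_task_definition))

# With boto3 installed and credentials configured, the inputs could be fetched as:
# ecs = boto3.client("ecs")
# task_definition = ecs.describe_task_definition(taskDefinition="<family:revision>")["taskDefinition"]
# task = ecs.describe_tasks(cluster="prefect-test", tasks=["<task id>"])["tasks"][0]
```

If the helper returns a role once the running task is passed in, the role was applied at run time even though the task definition does not list it.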

Prefect Task Role

There are two policies associated with the role

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecr:GetAuthorizationToken",
                "ecr:BatchCheckLayerAvailability",
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        }
    ]
}
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CreateSecurityGroup",
                "ec2:CreateTags",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcs",
                "ec2:DeleteSecurityGroup",
                "ecs:CreateCluster",
                "ecs:DeleteCluster",
                "ecs:DeregisterTaskDefinition",
                "ecs:DescribeClusters",
                "ecs:DescribeTaskDefinition",
                "ecs:DescribeTasks",
                "ecs:ListAccountSettings",
                "ecs:ListClusters",
                "ecs:ListTaskDefinitions",
                "ecs:RegisterTaskDefinition",
                "ecs:RunTask",
                "ecs:StopTask",
                "iam:PassRole",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogGroups",
                "logs:GetLogEvents"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "ssm:DescribeParameters",
                "ssm:GetParameters",
                "ssm:GetParametersByPath",
                "ssm:GetParameter"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}

Prefect Task Execution Role

Contains two policies listed below

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecr:GetAuthorizationToken",
                "ecr:BatchCheckLayerAvailability",
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        }
    ]
}
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogStreams"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": "ssm:DescribeParameters",
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "ssm:GetParameters",
                "ssm:GetParametersByPath",
                "ssm:GetParameter"
            ],
            "Resource": "arn:aws:ssm:us-west-2:XXXX:parameter/prefect/*",
            "Effect": "Allow"
        }
    ]
}

I found that it works if I pass a full task definition through the task_definition parameter of ECSRun. I have no idea why specifying the role the other way does not work, but this unblocks me for now.
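
For anyone hitting the same problem, a minimal sketch of that workaround might look like the following. It is an assumption about the shape of the task definition, not the exact one used here: the helper `flow_task_definition` is hypothetical, the container is named `flow` to match the task definition the agent generated above, and the network mode, CPU, and memory values are copied from it.

```python
def flow_task_definition(image: str, task_role_arn: str, execution_role_arn: str) -> dict:
    """Build a task definition dict that carries the task role explicitly."""
    return {
        # The agent-generated definition above uses a container named "flow".
        "containerDefinitions": [{"name": "flow", "image": image}],
        "taskRoleArn": task_role_arn,
        "executionRoleArn": execution_role_arn,
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        "cpu": "1024",
        "memory": "2048",
    }


TASK_DEFINITION = flow_task_definition(
    image="XXXX.dkr.ecr.us-west-2.amazonaws.com/workflows/test:latest",
    task_role_arn="arn:aws:iam::XXXX:role/prefect-test-task-role",
    execution_role_arn="arn:aws:iam::XXXX:role/prefect-test-service-role",
)

# With Prefect 1.x installed, the dict would be passed to the run config:
# from prefect.run_configs import ECSRun
# RUN_CONFIG = ECSRun(labels=["test"], task_definition=TASK_DEFINITION)
```

Building the dict in a function keeps the role ARNs in one place if several flows share the same layout.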

Thanks for sharing more details; that helps! What may be happening is that the task role ARN passed to the agent start command is applied only to the agent itself, not to the flow runs, whereas a task role set on a task definition generally applies to all containers started from that task definition. I remember also having to specify it explicitly when using S3 storage. The ECS agent already has some involved logic for merging the arguments supplied to the agent, the task definition, and the run config, so explicitly specifying them on the run config is always best.
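
To illustrate keeping the role on the run config, here is a small sketch that builds the `ECSRun` keyword arguments per flow, so each flow carries its own task role. The mapping, the second flow name, and its role ARN are hypothetical; `labels`, `image`, and `task_role_arn` are the `ECSRun` parameters already used in this thread.

```python
IMAGE = "XXXX.dkr.ecr.us-west-2.amazonaws.com/workflows/test:latest"

# Hypothetical mapping of flow name to the task role it should run with.
TASK_ROLES = {
    "Test": "arn:aws:iam::XXXX:role/prefect-test-task-role",
    "Reporting": "arn:aws:iam::XXXX:role/prefect-reporting-task-role",  # made-up example
}


def ecs_run_kwargs(flow_name: str) -> dict:
    """Return keyword arguments for ECSRun, failing loudly if no role is configured."""
    try:
        role = TASK_ROLES[flow_name]
    except KeyError:
        raise ValueError(f"No task role configured for flow {flow_name!r}") from None
    return {"labels": ["test"], "image": IMAGE, "task_role_arn": role}


# With Prefect 1.x installed:
# from prefect.run_configs import ECSRun
# RUN_CONFIG = ECSRun(**ecs_run_kwargs("Test"))
```

Raising on a missing entry avoids silently registering a flow without a task role, which is the failure mode described in this thread.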

Great to hear, nice work! :clap: