ECS Anywhere (on Raspberry Pi)

Amazon Elastic Container Service (ECS) allows you to run Docker-container-based workloads in an AWS-managed compute environment. In ECS, you have task definitions, which serve as blueprints for tasks, defining the containers to run, resource requirements (such as CPU and RAM), disk attachments, and networking setup. A task can then be run from this blueprint on an ECS cluster. An ECS cluster defines how workloads are run, i.e., on EC2 instances or on AWS Fargate. In this article, we will take a look at the third option, EXTERNAL, which is also referred to as ECS Anywhere.

ECS Anywhere, what does it mean?

Basically, the answer to this question is easy: you just add your own, possibly on-premise, servers to an ECS cluster. But how does this actually work? What are the pitfalls to navigate? And when would you actually use it?

The Workings

ECS Anywhere overall works very much like ECS on EC2. When you define a cluster with an EC2 instance, EC2 launches an instance for you using a pre-baked AMI. There is one essential service that connects the instance to ECS: the ECS agent. The ECS agent is baked into the AMI, so the launched instance can connect to the ECS cluster. Of course, there are other components that ECS needs, most notably a Docker runtime, which seems reasonable, as ECS essentially runs containers.

But there is a little more to it if you want to use your own servers. This is where AWS Systems Manager (SSM) comes into play. In order to manage your servers in AWS, you need to integrate them somehow, and SSM allows you to do exactly that: you can add your own servers to be managed as if they were in AWS, at least to some degree. To enable this, you need to install the SSM Agent on your servers. The SSM Agent also allows you to use IAM roles for your on-premise servers.

But don’t fret: AWS provides a script that automates a lot of the setup for you. We will take a look at it a little later in this article.

Common pitfalls

External instances in general cannot be in a virtual private cloud (VPC). If you need a shared network, you will have to set up AWS Direct Connect or AWS Site-to-Site VPN and configure proper routing, e.g., via AWS Transit Gateway. This becomes very important when you are running Amazon RDS databases in a VPC. Typically, you don’t want a database to be accessible via the internet, so you will need to find a way to make it reachable from your external instances, e.g., using an AWS service or by building a custom solution.

In an AWS VPC, you can also use security groups and network ACLs to control access to resources on the network. As mentioned, external instances cannot join a VPC, so security groups and network ACLs are not available. Any network access restrictions have to be implemented in your on-premise networking infrastructure, e.g., your on-premise firewall.

Oftentimes you also want to run an auto-scaling web application behind a load balancer. AWS provides different load balancers to achieve this. Unfortunately, the AWS load balancers cannot be used for external instances. You can resort to tools like Traefik: Traefik’s ECS provider can look up ECS services and configure load balancing accordingly. See also this AWS blog post on how to do this.

Use Cases

Why should I use this service, you may ask. “In my environment I don’t have any external servers, should I buy some?” The answer is: no, and in that case this service is most likely not for you.

Some reasons to use it are networking restrictions, or an existing on-premise data center that you want to integrate with AWS ECS and possibly migrate to it over time. Maybe you also have data locality requirements that dictate that the data has to be stored on-premise. And there can be other reasons, too.

Imagine you want to monitor and control the processes in a production plant. The machines you want to monitor potentially offer a REST endpoint, but you don’t want them to be accessible via the internet. You can run a server in the plant and configure it as an external instance in your already existing ECS cluster. This way, the server runs in your factory’s network and is not accessible via the internet. Tasks running on this server can connect to your machines to perform their work and, at the same time, use AWS services as if they were running in AWS themselves. Those tasks can, e.g., scrape metrics from your machines and then publish them to Amazon CloudWatch, authorized via the task’s IAM role.

As the server runs in your on-premise environment, tasks running on it can also access other on-premise resources like file systems. With the help of instance attributes in ECS, you can also make sure that tasks only run on specific instances, e.g., only on instances in a specific factory, by setting an attribute to the factory id. Using placement constraints, you can then ensure that tasks only run on instances in the same factory as the machines they are monitoring, as sketched below.
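
To make this a bit more concrete, here is a minimal CDK sketch of such a placement constraint. The attribute name location and the factory id are purely illustrative, and where exactly you attach the constraint depends on your setup: it can be supplied wherever a service or task accepts placement constraints (for example, the placementConstraints property of an Ec2Service, or the corresponding parameter when running a task directly).

import * as ecs from "aws-cdk-lib/aws-ecs";

// Illustrative custom attribute: each external instance is registered with
// "location" set to its factory id (for example via ECS_INSTANCE_ATTRIBUTES
// in /etc/ecs/ecs.config on the instance itself).
// The constraint below restricts placement to instances in that factory.
const onlyFactoryBerlin = ecs.PlacementConstraint.memberOf(
  "attribute:location == factory-berlin"
);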

Let’s try it out

Now that we know what ECS Anywhere is and how it works, let’s try it out. We will set up a Raspberry Pi as an external instance in an ECS cluster and run a simple task on it.

Prepare the Raspberry Pi

First, we need to set up the Raspberry Pi. We will use a Raspberry Pi 4 with 8 GB of RAM, running Ubuntu Server 22.04. We recommend using the Raspberry Pi Imager to flash the image to the SD card. In the Imager, we can select the Ubuntu Server 22.04 OS and also configure the Wi-Fi settings, so we don’t need to connect the Raspberry Pi to a monitor and keyboard for that. After flashing the image, we can insert the SD card into the Raspberry Pi and power it on.

On Ubuntu Server 22.04, we need to enable memory cgroups. To do so, we edit the file /boot/firmware/cmdline.txt and add the following options to the end of the line (a reboot is required for the change to take effect):

cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory systemd.unified_cgroup_hierarchy=0

Prepare the AWS environment

Now that we have the Raspberry Pi up and running, we can prepare the AWS environment. First, we will create a CDK stack with a VPC and an ECS cluster:

// Imports assume aws-cdk-lib v2.
import * as cdk from "aws-cdk-lib";
import { Construct } from "constructs";
import * as ec2 from "aws-cdk-lib/aws-ec2";
import * as ecs from "aws-cdk-lib/aws-ecs";

export class EcsAnywhereClusterStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    const vpc = new ec2.Vpc(this, "FargateVpc", {
      maxAzs: 1,
    });

    const cluster = new ecs.Cluster(this, "EcsAnywhereCluster", {
      vpc: vpc,
      clusterName: "EcsAnywhereCluster",
    });
  }
}
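
If you are starting from a fresh CDK app, the stack is instantiated in the app’s entry point as usual. The import path is illustrative and depends on your project layout:

import * as cdk from "aws-cdk-lib";
// Adjust the path to wherever the stack class lives in your project.
import { EcsAnywhereClusterStack } from "../lib/ecs-anywhere-cluster-stack";

const app = new cdk.App();
new EcsAnywhereClusterStack(app, "EcsAnywhereClusterStack");

Running cdk deploy then creates the VPC and the (still empty) cluster.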

Register the Raspberry Pi as an external instance in the ECS cluster

After deploying the stack, we can see the cluster in the ECS console. Now we need to register the Raspberry Pi as an external instance in the cluster. In the ECS console, we select the cluster and then click on “Register external instances” in the Infrastructure tab. This opens a dialog where we configure the activation (how long the activation key is valid and for how many external instances it can be used). Finally, the dialog generates a script that we can run on the Raspberry Pi. The script installs the SSM agent and the ECS agent and registers the Raspberry Pi as an external instance in our cluster. After running the script, we can see the Raspberry Pi in the ECS console.

Create and run the task on the Raspberry Pi

Now that we have the Raspberry Pi registered as an external instance in our cluster, we can create a task definition and a service. We will create a new stack for this.

export class EcsAnywhereServiceStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
  }
}
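
The snippets that follow all live inside this stack’s constructor. For reference, they assume imports roughly like these at the top of the file (module paths as in aws-cdk-lib v2):

import * as path from "path";
import * as cdk from "aws-cdk-lib";
import { Construct } from "constructs";
import * as ecs from "aws-cdk-lib/aws-ecs";
import * as iam from "aws-cdk-lib/aws-iam";
import * as logs from "aws-cdk-lib/aws-logs";
import * as s3 from "aws-cdk-lib/aws-s3";
import * as assets from "aws-cdk-lib/aws-ecr-assets";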

Before we start with the actual task definition, we should take care of the correct IAM permissions. Since we will attach the corresponding roles to the task definition later, we add them to our EcsAnywhereServiceStack first:

const bucket = new s3.Bucket(this, "MyS3Bucket", {
  blockPublicAccess: s3.BlockPublicAccess.BLOCK_ALL,
  encryption: s3.BucketEncryption.S3_MANAGED,
  enforceSSL: true,
});

const taskRole = new iam.Role(this, "EcsAnywhereTaskRole", {
  assumedBy: new iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
});

bucket.grantPut(taskRole);

const executionRole = new iam.Role(this, "EcsAnywhereExecutionRole", {
  assumedBy: new iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
});

In ECS, there are two roles that you can attach: the execution role and the task role. The execution role usually requires a managed policy called service-role/AmazonECSTaskExecutionRolePolicy, which grants read access to the Elastic Container Registry (ECR), for example for pulling the image, and write access to CloudWatch Logs. The execution role is assumed by ECS itself to run our task. So there are a few things to make sure of in general:

  1. ECS has to be able to pull the image. If your image is in a private ECR repository, the execution role has to be granted permission to access it.
  2. Writing logs to Amazon CloudWatch Logs requires granting permission to forward your task’s console output.

The task role permissions depend on your task. If you want to read objects from S3, for example, you need the corresponding S3 permissions. Remember, you should always follow the principle of least privilege and grant only the permissions that are actually necessary.
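
If you want the managed policy mentioned above, it can be attached to the execution role as shown below; alternatively, you can stick to narrower grants (further down we grant pull access on the image repository explicitly).

// Attach the AWS-managed execution role policy, which allows pulling images
// from ECR and writing container logs to CloudWatch Logs.
executionRole.addManagedPolicy(
  iam.ManagedPolicy.fromAwsManagedPolicyName(
    "service-role/AmazonECSTaskExecutionRolePolicy"
  )
);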


Now we can create the task definition and add our newly created roles to it. Please note that you have to set the compatibility to EXTERNAL and not FARGATE or EC2. Otherwise, the task will not be able to run on the Raspberry Pi.

const taskDefinition = new ecs.TaskDefinition(
  this,
  "EcsAnywhereTaskDefinition",
  {
    family: "WriteToS3",
    compatibility: ecs.Compatibility.EXTERNAL,
    cpu: "256",
    memoryMiB: "128",
    taskRole,
    executionRole,
  }
);

Great, now we can add a container to the task definition. We will use CDK’s DockerImageAsset, which builds a Docker image and pushes it to the CDK-managed ECR repository. We also add a port mapping; setting hostPort to 0 lets Docker pick a free host port, as the task definition uses the default bridge network mode:

const asset = new assets.DockerImageAsset(this, "MyBuildImage", {
  directory: path.join(__dirname, "..", "..", "service"),
});
asset.repository.grantPull(executionRole);

const logGroup = new logs.LogGroup(this, "WriteToS3LogGroup", {
  logGroupName: "WriteToS3",
  retention: logs.RetentionDays.ONE_DAY,
});

const serviceContainer = taskDefinition.addContainer("service", {
  image: ecs.ContainerImage.fromDockerImageAsset(asset),
  essential: true,
  logging: new ecs.AwsLogDriver({
    streamPrefix: "WriteToS3",
    logGroup,
  }),
});

serviceContainer.addPortMappings({
  containerPort: 80,
  hostPort: 0,
  protocol: ecs.Protocol.TCP,
});

Last but not least, we can create the service with our previously created task definition and run it on the Raspberry Pi. Note that we need to use ExternalService instead of FargateService or Ec2Service, and that cluster refers to the ECS cluster created in the EcsAnywhereClusterStack (see the sketch below the snippet for one way to wire this up):

const service = new ecs.ExternalService(this, "WriteToS3Service", {
  cluster,
  taskDefinition,
  desiredCount: 1,
});
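
One detail to take care of: cluster in the snippet above has to reference the cluster from the EcsAnywhereClusterStack. A simple way to do that (a sketch; the property and prop names are illustrative) is to expose the cluster on the cluster stack and hand it to the service stack via props, extending the app entry point shown earlier:

// In EcsAnywhereClusterStack, expose the cluster as a public property:
//   public readonly cluster: ecs.Cluster;
//   ...
//   this.cluster = cluster;

// Illustrative props for the service stack; its constructor would then accept
// EcsAnywhereServiceStackProps instead of plain cdk.StackProps.
export interface EcsAnywhereServiceStackProps extends cdk.StackProps {
  readonly cluster: ecs.ICluster;
}

// In the CDK app entry point, wire the two stacks together:
const app = new cdk.App();
const clusterStack = new EcsAnywhereClusterStack(app, "EcsAnywhereClusterStack");
new EcsAnywhereServiceStack(app, "EcsAnywhereServiceStack", {
  cluster: clusterStack.cluster,
});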

Conclusion

Setting up ECS Anywhere on servers with supported operating systems is pretty straightforward, thanks to AWS providing a streamlined script that automates the entire setup process. This script takes care of installing and configuring both the SSM and ECS Agents on your machine.

Ensuring that cgroups are enabled is a requirement which most operating systems fulfill by default. However, on Raspberry Pi, it’s crucial to explicitly enable this in the boot configuration.

An exciting facet of ECS Anywhere is the seamless integration with CDK, offering full support for task definitions when running services on external instances. This allows for a smooth transition from running services on external instances to running them on EC2 instances in the future.

What we find most useful, though, is that AWS IAM permissions can also be used for tasks that run on external instances. This empowers your services to seamlessly interact with any AWS service using IAM authorization. In our example we mentioned the monitoring of machines in a factory (or even many factories). Now, envision crafting a service that retrieves telemetry information from machines within a local network and effortlessly sends the data to CloudWatch, mirroring the behavior as if the service were running natively in AWS.

Moreover, leveraging attributes in tandem with the DAEMON scheduling strategy allows us to ensure that every external instance in each factory (provided they have a custom attribute like location = factory) executes one of these monitoring tasks. This strategic use of attributes adds a layer of precision and control to the deployment of monitoring tasks across varied external instances.

In summary, ECS Anywhere emerges as an ideal solution for bridging the gap between on-premise servers and the cloud, offering a seamless pathway for gradual migration and integration into the expansive AWS ecosystem over time.

Anne is a Cloud Consultant at superluminar. With her passion for software development and everything to do with the cloud, she is always striving to learn more about the latest technologies and trends and to expand her skills. In this blog, she shares her insights and her knowledge on AWS-specific topics.

Robert is a Cloud Consultant at superluminar and has the AWS Certified Data Analytics Specialty. He writes here about AWS-specific topics.