How to get the IP Address for a specific AWS ECS task?

I am attempting to build my own version of service discovery within ECS, since the services that I wish to scale up and down are not HTTP servers and so cannot be managed by ELB. Also, ECS doesn't yet support Docker's user-defined networks feature, which would be another way of doing service discovery. As mentioned in that issue discussion:

Currently Service Discovery is a huge pain requiring yet another service (which itself is usually cluster-based and self-discovers and then listens for other services). It's a messy solution, not to mention the Lambda "solutions" that are even more obnoxious to implement and maintain.

So I'm going the obnoxious Lambda "solution" route in lieu of other options. The main thing I need to build this hack service discovery is the IP address of each of the docker containers running on my EC2 hosts.

By SSH'ing into the EC2 server acting as one of my ECS container instances, I can run docker ps to get the container ids for each running docker container. For any given containerId, I can run docker inspect ${containerId} which returns JSON including many details about that container, in particular the NetworkSettings.IPAddress bound to that container (the main thing I need for my discovery implementation).
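
For reference, pulling that value out of the docker inspect output takes only a few lines of Node.js. This is a minimal sketch: the helper name and the sample JSON are made up for illustration, and it only covers the default bridge network, where NetworkSettings.IPAddress is populated.

```javascript
// Hypothetical helper: extract the bridge-network IP from `docker inspect` output.
// `docker inspect` prints a JSON array with one object per inspected container.
function containerIpFromInspect(inspectJson) {
    var parsed = JSON.parse(inspectJson);
    return parsed.map((c) => ({
        id: c.Id,
        // null when the container has no bridge IP (e.g. it is stopped)
        ip: (c.NetworkSettings && c.NetworkSettings.IPAddress) || null
    }));
}

// Illustrative sample, trimmed to the fields used above
var sample = JSON.stringify([
    { Id: "abc123", NetworkSettings: { IPAddress: "172.17.0.2" } }
]);

console.log(containerIpFromInspect(sample));
```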

I am trying to use the AWS SDK from within Lambda to get this value. Here's my Lambda function so far (you should be able to run this too - nothing specific to my setup here):

exports.handler = (event, context, callback) => {
    var AWS = require('aws-sdk'),
        ecs = new AWS.ECS({"apiVersion": '2014-11-13'});

    ecs.listClusters({}, (err, data) => {
        if (err) return callback(err);
        data.clusterArns.forEach((clusterArn) => {
            ecs.listTasks({
                cluster: clusterArn
            }, (err, data) => {
                if (err) return console.log(err, err.stack);
                // describeTasks rejects an empty task list, so skip idle clusters
                if (!data.taskArns.length) return;
                ecs.describeTasks({
                    cluster: clusterArn,
                    tasks: data.taskArns
                }, (err, data) => {
                    if (err) console.log(err, err.stack);
                    else     console.log(JSON.stringify(data, null, 4));
                });
            });
        });
    });
};

The output from the describeTasks call is pretty useless. It doesn't have nearly as much detail as the docker inspect call produces, in particular it does not include the IP Address of the docker container running the task.

I have also attempted to find the data I need via the describeContainerInstances call, but as expected that did not return any task-specific details.

I would be willing to try running docker inspect directly on the EC2 host, if there was any way to do so from Lambda. I'm not sure if it is possible to run commands on the container via the SDK; probably not. Therefore I would have to build a custom service running on a specially-made version of the ECS container image, which sounds terrible.

How might I go about getting these container IP addresses using the AWS SDK? Or some better idea about how to solve the general problem of service discovery in ECS?

Solution 1:

It turns out my original premise (needing to know the task container's own internal IP address for service discovery) is very flawed - that IP address is only reachable from within the single EC2 container instance that hosts the container. If you have multiple container instances (which you probably should have) then those task container IPs are basically useless.

The alternative solution that I came up with is to follow the pattern suggested for Application Load Balancers running HTTP / HTTPS - have a port mapping with 0 as the host port, pointing to the port within the Docker container that I need to use. By doing this, Docker will assign a random host port, which I can then find by using the AWS SDK - in particular, the "describeTasks" function available on the ECS module. See here for details: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/ECS.html#describeTasks-property
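
A minimal sketch of that pattern, not taken from the project itself: the object below shows the shape of a container definition with a dynamic host port, and the helper reads the assigned port back out of the data returned by describeTasks. The container name, image, port numbers, helper name and sample data are placeholders; only the portMappings and networkBindings fields come from the ECS API.

```javascript
// Container definition fragment, in the shape registerTaskDefinition expects.
// hostPort 0 asks for a dynamically assigned host port.
var containerDefinition = {
    name: "worker",                  // placeholder name
    image: "example/worker:latest",  // placeholder image
    memory: 128,
    portMappings: [
        { containerPort: 5432, hostPort: 0, protocol: "tcp" }
    ]
};

// Hypothetical helper: list the host ports assigned to a named container,
// given the data returned by describeTasks.
function hostPortsFor(describeTasksData, containerName) {
    return describeTasksData.tasks.reduce((final, task) =>
        final.concat(
            task.containers
                .filter((container) => container.name === containerName)
                .reduce((ports, container) =>
                    ports.concat((container.networkBindings || []).map((b) => b.hostPort))
                , [])
        )
    , []);
}

// Made-up response, trimmed to the fields used here
var sampleData = {
    tasks: [{
        containers: [{
            name: "worker",
            networkBindings: [
                { bindIP: "0.0.0.0", containerPort: 5432, hostPort: 32769, protocol: "tcp" }
            ]
        }]
    }]
};

console.log(hostPortsFor(sampleData, "worker")); // [ 32769 ]
```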

This is the fundamental basis for my roll-your-own service discovery mechanism - there are a lot of other details necessary to do this in a complete fashion. I used Lambda functions calling out to the AWS SDK as well as a PostgreSQL database in order to keep my list of host containers up to date (something a bit like a dynamic DNS registry). Part of the trick is that you need to know both the IP and the port for each of the containers, but describeTasks only returns the port; the IP has to be looked up from the EC2 instance hosting the task. Here is a handy Node.js function that I wrote which takes a container name and looks for all IP addresses and ports found within the cluster for containers of that name:

var Q = require('q');
/**
 * @param {String} cluster - name of the cluster to query, e.g. "sqlfiddle3"
 * @param {String} containerType - name of the container to search for within the cluster
 * @returns {Promise} promise resolved with a list of ip/port combinations found for this container name, like so:
 *  [
 *    {
 *      "connection_meta": "{\"type\":\"ecs\",\"taskArn\":\"arn:aws:ecs:u..\"}",
 *      "port": 32769,
 *      "ip": "10.0.1.49"
 *    }
 *  ]
 */
exports.getAllHostsForContainerType = (cluster, containerType) => {
    var AWS = require('aws-sdk'),
        ecs = new AWS.ECS({"apiVersion": '2014-11-13'}),
        ec2 = new AWS.EC2({"apiVersion": '2016-11-15'});

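    return ecs.listTasks({ cluster }).promise()
    // describeTasks rejects an empty task list, so skip the call when nothing is running
    .then((taskList) => taskList.taskArns.length
        ? ecs.describeTasks({ cluster, tasks: taskList.taskArns }).promise()
        : { tasks: [] })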
    .then((taskDetails) => {
        var containersForName = taskDetails.tasks
            .map((taskDetail) =>
                taskDetail.containers
                    // keep only the containers with the requested name,
                    // not their siblings within the same task
                    .filter((container) => container.name === containerType)
                    .map((container) => {
                        container.containerInstanceArn = taskDetail.containerInstanceArn;
                        return container;
                    })
            )
            .reduce((final, containers) =>
                final.concat(containers)
            , []);

        return containersForName.length ? (ecs.describeContainerInstances({ cluster,
            containerInstances: containersForName.map(
                (containerDetails) => containerDetails.containerInstanceArn
            )
        }).promise()
        .then((containerInstanceList) => {

            containersForName.forEach((containerDetails) => {
                containerDetails.containerInstanceDetails = containerInstanceList.containerInstances.filter((instance) =>
                    instance.containerInstanceArn === containerDetails.containerInstanceArn
                )[0];
            });

            return ec2.describeInstances({
                InstanceIds: containerInstanceList.containerInstances.map((instance) =>
                    instance.ec2InstanceId
                )
            }).promise();
        })
        .then((instanceDetails) => {
            var instanceList = instanceDetails.Reservations.reduce(
                (final, res) => final.concat(res.Instances), []
            );

            containersForName.forEach((containerDetails) => {
                if (containerDetails.containerInstanceDetails) {
                    containerDetails.containerInstanceDetails.ec2Instance = instanceList.filter(
                        (instance) => instance.InstanceId === containerDetails.containerInstanceDetails.ec2InstanceId
                    )[0];
                }
            });
            return containersForName;
        })) : [];
    })
    .then(
        (containersForName) => containersForName.map(
            (container) => ({
                connection_meta: JSON.stringify({
                    type: "ecs",
                    taskArn: container.taskArn
                }),
                // assumes that this container has exactly one network binding
                port: container.networkBindings[0].hostPort,
                ip: container.containerInstanceDetails.ec2Instance.PrivateIpAddress
            })
        )
    );
};

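Note that this requires the 'Q' promise library at the top, so you'll need to declare that as a dependency in your package.json. The promise chain itself comes from the AWS SDK's .promise() method, which uses the native Promise unless configured otherwise, so the require can be dropped if you don't otherwise need Q.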
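
The final mapping step above assumes exactly one network binding per container. If that doesn't hold for your tasks, something along these lines could take its place. This is only a sketch: the function name toHostEntries and the sample data are hypothetical, and it expects the same enriched container objects that the function above builds.

```javascript
// Hypothetical replacement for the last .then(): emits one entry per network
// binding and skips containers whose EC2 instance could not be resolved.
function toHostEntries(containersForName) {
    return containersForName.reduce((final, container) => {
        var ec2Instance = container.containerInstanceDetails &&
            container.containerInstanceDetails.ec2Instance;
        if (!ec2Instance) return final;
        return final.concat((container.networkBindings || []).map((binding) => ({
            connection_meta: JSON.stringify({ type: "ecs", taskArn: container.taskArn }),
            port: binding.hostPort,
            ip: ec2Instance.PrivateIpAddress
        })));
    }, []);
}

// Made-up input: one container with two bindings, one with no resolved instance
var sampleContainers = [
    {
        taskArn: "arn:example:task/1",
        networkBindings: [{ hostPort: 32769 }, { hostPort: 32770 }],
        containerInstanceDetails: { ec2Instance: { PrivateIpAddress: "10.0.1.49" } }
    },
    { taskArn: "arn:example:task/2", networkBindings: [{ hostPort: 32771 }] }
];

console.log(toHostEntries(sampleContainers));
```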

The rest of my custom solution for handling ECS service discovery using Lambda functions can be found here: https://github.com/jakefeasel/sqlfiddle3#setting-up-in-amazon-web-services

Solution 2:

You can associate a Classic Elastic Load Balancer with an ECS service even if your services are not HTTP. Make sure you create a TCP listener (not HTTP or HTTPS/SSL) on the ELB and point it at the exposed port of your container. The disadvantage of using a Classic ELB rather than an Application ELB is that you need a separate ELB for each ECS service (additional cost).

http://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-listener-config.html
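
As a rough illustration, the parameters for such a listener could be built like this for the createLoadBalancer call on the SDK's AWS.ELB (Classic) client. The helper name, load balancer name, port and subnet ID are placeholders, and the sketch assumes the container is exposed on a fixed host port equal to the listener port.

```javascript
// Hypothetical helper: parameters for AWS.ELB#createLoadBalancer with a plain
// TCP listener that forwards `port` on the ELB to the same port on the instances.
function tcpLoadBalancerParams(name, port, subnetIds) {
    return {
        LoadBalancerName: name,
        Listeners: [{
            Protocol: "TCP",          // front-end protocol (not HTTP/HTTPS/SSL)
            LoadBalancerPort: port,
            InstanceProtocol: "TCP",  // back-end protocol
            InstancePort: port
        }],
        Subnets: subnetIds
    };
}

var params = tcpLoadBalancerParams("my-tcp-service", 6379, ["subnet-00000000"]);
console.log(JSON.stringify(params, null, 4));

// With the SDK loaded, the call would look like:
//   new AWS.ELB().createLoadBalancer(params).promise()
```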
