AWS Collector

Like all Soluble collectors, the aws-collector is deployed as a docker container.

It will interrogate the AWS API control plane and build a graph model of your AWS infrastructure.

The collector only needs read-only access to your AWS account.

Quick Start

The easiest way to try the aws-collector is to use docker compose. We don't advocate using compose for day-to-day activities, but it is a convenient way to get started with a working configuration.

This quickstart will create three containers that work together:

  1. Soluble Dashboard UI - https://localhost.soluble.ai:8443
  2. AWS Collector - Scans the account set up in ${HOME}/.aws
  3. Neo4j Database - http://localhost.soluble.ai:7474

curl \
https://raw.githubusercontent.com/soluble-ai/soluble/master/quickstart/aws/docker-compose.yml | \
docker-compose -f - up

You may want to look at the docker-compose.yml itself before running.

After a minute or two of downloading the images and starting up, you will be able to log into the dashboard at:

https://localhost.soluble.ai:8443

The username and password will default to admin / admin.

The scanner will begin to make API calls to AWS, process the results and build a graph in Neo4j which you can see in the dashboard.

Notes

  1. Most Soluble configuration is passed via environment variables. GRAPH_URL, GRAPH_USERNAME, and GRAPH_PASSWORD are examples of this.
  2. We use a volume mapping to provide the AWS credentials to the container. The AWS credentials in the $HOME/.aws directory on the host are mapped to /app/.aws, which is $HOME/.aws inside the container. Passing the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY works as well. If you pass no credentials at all, the dashboard will try to obtain credentials through the instance role available to the container (assuming it is running inside AWS).
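
The three credential sources in note 2 can be sketched as a simple decision function. This is only an illustration in Python, not the collector's code: the function name is hypothetical, the precedence (environment variables, then credentials file, then instance role) is an assumption that mirrors the usual AWS SDK default chain, and it only checks for a file named credentials.

```python
import os
from pathlib import Path


def credential_source(env, aws_dir):
    """Guess which credential source from note 2 would apply (hypothetical helper)."""
    # Explicit keys passed as environment variables.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"
    # Volume-mapped credentials, e.g. /app/.aws inside the container.
    if (Path(aws_dir) / "credentials").is_file():
        return "shared-credentials-file"
    # Nothing supplied: fall back to the instance role (only works inside AWS).
    return "instance-role"


if __name__ == "__main__":
    print(credential_source(os.environ, Path.home() / ".aws"))
```

Running this on the host before starting the containers gives a rough idea of which credentials a volume-mapped container is likely to pick up.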

Start Exploring With Cypher!

If you want to get a quick visualization of all your stuff, you can run the following Cypher query in the browser and poke around:

match (a) where a.graphEntityGroup='aws' return a limit 100;

If you want to look at something more targeted, like all your VPCs and the things that are directly connected to them:

match (a:AwsVpc)--(b) return a,b;

This would show you all your EC2 Instances and the entities that they are directly connected to:

match (a:AwsEc2Instance)--(b) return a,b;

You can make your Cypher queries arbitrarily complex and Neo4j will comply.

This query will show all the AMIs in use by your EC2 instances, the creation date of each AMI, and the number of EC2 instances using it. Very useful for vulnerability assessments!

match (a:AwsAmi)--(b:AwsEc2Instance) return
a.imageId,a.description,a.creationDate,count(b);

Here is a query that, given a known AMI, coughs up the Auto Scaling groups that are using it.

match (a:AwsAmi {imageId:'ami-0a2abab4107669c1b'})--
(b:AwsEc2Instance)--
(g:AwsAsg) return g.name;

The power starts to come into focus as you traipse around the graph. For instance, if you have a known vulnerability in an AMI (ironically this AMI has a vulnerability in Docker that affects Kubernetes), you can easily trace its usage back to questions that might be important:

Is it serving traffic that is public facing? What service is it handling? What is the IP and/or DNS name of the exposed URL?

Good times.

Configuration

The following options can be passed as environment variables.

Env Variable | Required? | Description | Example
GRAPH_URL | Required | URL of the graph database | bolt://myserver:7687
GRAPH_USERNAME | - | Database username |
GRAPH_PASSWORD | - | Database password |
AWS_REGIONS | - | Comma-separated list of regions to be scanned. If omitted, only the "current" region will be scanned. | us-east-1,us-west-2
GRAPH_ENTITY_EXCLUDES | - | Comma-separated list of graph entity types that should be excluded from the scan. | AwsRdsInstance, AwsApiGatewayRestApi
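
As a rough illustration of how these variables might be interpreted, here is a minimal Python sketch. The CollectorConfig class and load_config function are hypothetical names, not part of the collector; only the variable names and the "GRAPH_URL is required" rule come from the table above.

```python
from dataclasses import dataclass, field
from typing import List, Mapping, Optional


def _split_csv(value: Optional[str]) -> List[str]:
    """Split a comma-separated value, dropping blanks and stray spaces."""
    if not value:
        return []
    return [part.strip() for part in value.split(",") if part.strip()]


@dataclass
class CollectorConfig:
    graph_url: str
    graph_username: Optional[str] = None
    graph_password: Optional[str] = None
    # An empty list stands for "scan only the current region".
    regions: List[str] = field(default_factory=list)
    entity_excludes: List[str] = field(default_factory=list)


def load_config(env: Mapping[str, str]) -> CollectorConfig:
    """Build a config from an environment-like mapping (e.g. os.environ)."""
    graph_url = env.get("GRAPH_URL")
    if not graph_url:
        raise ValueError("GRAPH_URL is required")
    return CollectorConfig(
        graph_url=graph_url,
        graph_username=env.get("GRAPH_USERNAME"),
        graph_password=env.get("GRAPH_PASSWORD"),
        regions=_split_csv(env.get("AWS_REGIONS")),
        entity_excludes=_split_csv(env.get("GRAPH_ENTITY_EXCLUDES")),
    )
```

Trimming whitespace around each item means that both "us-east-1,us-west-2" and "AwsRdsInstance, AwsApiGatewayRestApi" parse into clean lists.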

Supported Types

Type Scanner
AwsAccount AccountScanner
AwsAmi AmiScanner
AwsApiGatewayRestApi ApiGatewayRestApiScanner
AwsAsg AsgScanner
AwsAvailabilityZone AvailabilityZoneScanner
AwsCacheCluster ElastiCacheScanner
AwsCacheClusterNode ElastiCacheScanner
AwsEc2Instance Ec2InstanceScanner
AwsEgressOnlyInternetGateway EgressOnlyInternetGatewayScanner
AwsEksCluster EksClusterScanner
AwsEmrCluster ElasticMapReduceClusterScanner
AwsEmrClusterInstance ElasticMapReduceClusterScanner
AwsElb ElbClassicScanner
AwsElbTargetGroup ElbTargetGroupScanner
AwsHostedZone Route53Scanner
AwsHostedZoneRecordSet Route53Scanner
AwsInternetGateway AwsInternetGatewayScanner
AwsLambdaFunction LambdaFunction
AwsLaunchConfig LaunchConfigScanner
AwsLaunchTemplate LaunchTemplateScanner
AwsRdsCluster RdsCluster
AwsRdsInstance RdsInstanceScanner
AwsRegion RegionScanner
AwsRouteTable RouteTable
AwsSecurityGroup SecurityGroupScanner
AwsSnsTopic SnsScanner
AwsSnsSubscription SnsScanner
AwsSqsQueue SqsScanner
AwsElbListener ElbListenerScanner
AwsS3Bucket S3Scanner
AwsSubnet SubnetScanner
AwsVpc VpcScanner
AwsVpcEndpoint VpcEndpointScanner
AwsVpcPeeringConnection VpcPeeringConnectionScanner
AwsVpnGateway VpnGatewayScanner

Development

Naming Conventions

Entity and Attributes

  • All AWS entities should begin with Aws. For example AwsAccount, AwsRegion, AwsEc2Instance, etc.
  • All AWS nodes should have a region attribute. The values of region should be the lower case region names: us-east-1, us-west-2, etc.
  • All attributes should use lower camel case.
  • arn should be used wherever possible. ARNs are globally unique. Names are typically unique only by account-region pairs. IDs tend to be region-unique, but have no guarantees of uniqueness across regions.
  • Unique index for arn should be created if arn attribute is used.
  • All AWS nodes should have graphEntityType set to the label of the node and graphEntityGroup=aws. The graphEntityType property makes it easier for the code to know what kind of node it is using.
  • Be very careful to qualify all non-unique attributes by account and region. For instance, if you are looking up or modifying an AwsElb with {name:'foo'}, be aware that this could match an ELB in any account or region. It should be restricted with additional pattern-matching attributes: {name:'foo', account:'1111111111', region:'us-east-1'}
  • The Jackson object-to-JSON converter is used for most of the JSON serialization. The AWS Java classes are auto-generated by Amazon in the SDK from the API specification. This is a somewhat indirect representation of the underlying API, but in practice it works very well.
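
The account-region qualification rule above can be sketched as a small query builder. This is an illustrative Python helper, not part of the collector: qualified_match is a hypothetical name, and the validation patterns simply encode the "Aws" prefix and lower camel case conventions listed above. Values are passed as Cypher parameters; the label and attribute names are validated because Cypher does not allow them to be parameterized.

```python
import re

_LABEL = re.compile(r"^Aws[A-Za-z0-9]+$")      # entity labels begin with Aws
_KEY = re.compile(r"^[a-z][A-Za-z0-9]*$")      # attributes use lower camel case


def qualified_match(label, account, region, **attrs):
    """Return (cypher, params) for a lookup that is always scoped to account and region."""
    if not _LABEL.match(label):
        raise ValueError(f"label must begin with Aws: {label!r}")
    params = {"account": account, "region": region.lower(), **attrs}
    for key in params:
        if not _KEY.match(key):
            raise ValueError(f"attribute is not lower camel case: {key!r}")
    # Sort keys so the generated query text is stable.
    props = ", ".join(f"{key}: ${key}" for key in sorted(params))
    return f"MATCH (n:{label} {{{props}}}) RETURN n", params


if __name__ == "__main__":
    query, params = qualified_match("AwsElb", "1111111111", "us-east-1", name="foo")
    print(query)
    print(params)
```

Because account and region are mandatory arguments, a caller cannot accidentally issue the unqualified {name:'foo'} lookup.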

Relationships

  • Create only directed relationships. Use active verb relationship names where possible. Passive relationship names are OK when the active direction doesn't make a lot of sense.
  • When things contain other things, use the relationship name HAS
  • When an entity makes reference to another entity, but doesn't "own" it, use USES or ATTACHED_TO. For instance, (e:AwsEc2Instance)-[:USES]->(s:AwsSecurityGroup)
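
To illustrate these rules, here is a hedged Python sketch that generates a Cypher statement for a directed relationship. The relate function and the $fromArn / $toArn parameter names are hypothetical, and the sketch assumes both nodes carry an arn attribute.

```python
import re

_LABEL = re.compile(r"^Aws[A-Za-z0-9]+$")
_REL_TYPE = re.compile(r"^[A-Z]+(_[A-Z]+)*$")  # e.g. HAS, USES, ATTACHED_TO


def relate(from_label, rel_type, to_label):
    """Return Cypher that creates one directed relationship between two nodes matched by arn."""
    for label in (from_label, to_label):
        if not _LABEL.match(label):
            raise ValueError(f"label must begin with Aws: {label!r}")
    if not _REL_TYPE.match(rel_type):
        raise ValueError(f"relationship type must be upper snake case: {rel_type!r}")
    return (
        f"MATCH (a:{from_label} {{arn: $fromArn}}), (b:{to_label} {{arn: $toArn}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )


if __name__ == "__main__":
    print(relate("AwsEc2Instance", "USES", "AwsSecurityGroup"))
```

MERGE with an explicit arrow keeps the relationship directed and avoids creating duplicates when the same scan runs twice.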

Writing an AWS Scanner

  • First, understand that the AWS APIs are charmingly inconsistent. They are clearly all written by the same company and are mostly consistent with naming and such. However, there is a lot of subtle variation, including:

    • Some APIs use pagination. Some don't. The tokens that they use for pagination have different names.
    • Some refer to their entities by ARN. Others by id. Others by name.
    • Exceptions and behavior for entities that aren't found vary widely.
    • Performance varies widely across APIs. Some hit rate limits easily. Others don't.
    • etc.
  • We are able to eliminate a lot of boilerplate from the scanners, but a fair amount remains.

  • All AWS scanners should extend AwsEntityScanner<T, C>, where T is the type of the primary entity that they handle and C is the AWS client type. This helps the resulting code provide strongly typed methods that eliminate needless casting and boilerplate.

  • AwsEntityScanner instances are locked to an account-region pair.

  • The underlying AwsScanner takes care of sharing client connections. In general you should simply call AwsEntityScanner.getClient(<builderClass>) to get an AWS client of your choice. Soluble will take care of re-using that client.

  • All scanners need to reference an AwsEntityType enum type that should match with the label name in Neo4j. For instance AwsEc2Instance, AwsVpc, etc.

  • In general, doScan() should keep going when exceptions are encountered. You are encouraged to use tryExecute() to handle exceptions correctly as it loops through returned items.

  • When single entities are scanned, exceptions should NOT be caught. The thinking here is that if you are scanning a specific item and it fails, you probably want to know. However, if you are scanning a whole account-region, you want as much of the data as possible.

  • It is recommended to use the GraphBuilder DSL for most graph mutations rather than writing cypher directly. It eliminates a lot of error-prone boilerplate.
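
The control flow described in this list (variable pagination tokens, full scans that keep going, single-entity scans that fail loudly) can be sketched as follows. This is a language-neutral illustration written in Python, not the Java API: paginate, try_execute, do_scan and scan_one are hypothetical stand-ins, and the "Items" field name is an assumption, since each AWS API names its result list differently.

```python
import logging

log = logging.getLogger("scanner-sketch")


def paginate(fetch_page, token_field="NextToken"):
    """Yield items from a paged API; the token field name differs between APIs."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("Items", [])   # "Items" is an assumed field name
        token = page.get(token_field)
        if not token:
            return


def try_execute(action, item):
    """Run action(item); log and swallow any failure so the caller can continue."""
    try:
        action(item)
        return True
    except Exception:
        log.exception("failed to process %r; continuing scan", item)
        return False


def do_scan(fetch_page, project, token_field="NextToken"):
    """Full account-region scan: collect as much data as possible."""
    ok = failed = 0
    for item in paginate(fetch_page, token_field):
        if try_execute(project, item):
            ok += 1
        else:
            failed += 1
    return ok, failed


def scan_one(fetch_one, project, entity_id):
    """Targeted scan of a single entity: exceptions propagate to the caller."""
    project(fetch_one(entity_id))
```

Keeping the "swallow and continue" behavior in one helper makes the difference between the two scan modes explicit: do_scan routes every item through try_execute, while scan_one deliberately does not.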

Dealing With Deletion

Dealing with deleted items is one of the trickier parts of maintaining the graph. What has worked best is a two-phase operation. When a full scan of a given type in an account-region tuple is complete, there is code that loads all the entities of that type for that account-region pair whose graphUpdateTs is earlier than the time the operation started.

We don't know for sure that those items have been deleted, but we are able to interrogate the API and AWS will tell us for certain.

To find out if the entity has been deleted or not, we need to make a single get or describe operation for that entity. If AWS responds definitively that the item does not exist, we are free to delete it from the graph.

There is some complexity with the multi-tenant data model. We need to make sure that we are only operating on items from the account-region tuple that our scanner is configured to use. Calling describe or get on an entity from another account or region would cause AWS to return some variant of NotFound, which might cause us to delete an entity that does not belong to our account-region. That is, AWS correctly responds that the item does not exist, but the error was ours for using the wrong account.
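
The two-phase approach and the account-region guard can be sketched as below. This is a simplified Python illustration over in-memory dictionaries, not the collector's implementation: find_stale, reap, and the NotFound exception are hypothetical names, and describe / delete stand in for the AWS lookup and the graph deletion.

```python
class NotFound(Exception):
    """Stand-in for the various 'does not exist' errors AWS APIs return."""


def find_stale(entities, account, region, scan_started_ts):
    """Phase 1: entities in this account-region not touched by the latest full scan."""
    return [
        e for e in entities
        if e["account"] == account
        and e["region"] == region
        and e["graphUpdateTs"] < scan_started_ts
    ]


def reap(stale, account, region, describe, delete):
    """Phase 2: delete only entities that AWS definitively reports as missing."""
    deleted = []
    for entity in stale:
        # Guard: never ask AWS about an entity from another account-region,
        # because the resulting NotFound would be our mistake, not a deletion.
        if (entity["account"], entity["region"]) != (account, region):
            continue
        try:
            describe(entity["arn"])
        except NotFound:
            delete(entity["arn"])
            deleted.append(entity["arn"])
        except Exception:
            # Throttling and similar errors are not definitive; keep the node.
            continue
    return deleted
```

The guard is repeated inside reap even though find_stale already filters by account and region, so that a caller passing an unfiltered list cannot trigger a wrongful deletion.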