AWS overview

This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the AWS category.

Last Updated: 2024-09-19

Note that this article contains many ideas from the Manning book AWS in Action. I highly recommend you buy it.

General points

Resilience

Choosing a region

Factors to consider:
- latency: how far the region is from your customers
- availability of desired AWS services: not everything is available in every region. Use online tools to determine this.
- legal issues: e.g. are you allowed to store data in country Y?
- where your other AWS infrastructure is based. E.g. with DynamoDB, no additional traffic charges apply if you access DynamoDB from EC2 instances in the same region.

Each region consists of multiple availability zones (AZs). You can think of an AZ as an isolated group of data centers, and a region as an area where multiple availability zones are located at a sufficient distance from one another. The region us-east-1 consists of six availability zones (us-east-1a to us-east-1f), for example. The availability zone us-east-1a could be one data center, or many (this is not public info).

The AZs are connected through low-latency links, so requests between different availability zones aren’t as expensive as requests across the internet in terms of latency. The latency within an availability zone (such as from one EC2 instance to another in the same subnet) is lower than the latency across AZs.

CloudFormation

S3

For maximum performance, choose keys with even distribution of characters in their prefixes. In S3, keys are stored in alphabetical order in an index. The key name determines which partition the key is stored in. If your keys all begin with the same characters, this will limit the I/O performance of your S3 bucket. Thus names like "image1.png", "image2.png" will underperform names like "ffsa-image1.png", "abaw-image2.png" that have an MD5 hash of the original key as a prefix.
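The hash-prefix idea can be sketched in a few lines of Python. This is only an illustration: the helper name `hashed_key`, the 4-character prefix length and the `-` separator are assumptions made here, not something the paragraph prescribes.

```python
import hashlib


def hashed_key(original_key: str, prefix_len: int = 4) -> str:
    """Prefix a key with the first hex characters of the MD5 hash of that key.

    The prefix spreads otherwise similar names (image1.png, image2.png, ...)
    across the alphabetically sorted key space.
    """
    digest = hashlib.md5(original_key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{original_key}"


# Hypothetical file names, used only for illustration.
for name in ["image1.png", "image2.png", "image3.png"]:
    print(hashed_key(name))
```

Because the prefix is derived from the original key, you can recompute the full key at read time without storing a separate mapping.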

Using a slash (/) in the key name acts like creating a folder for your object. If you create an object with the key folder/object.png, the folder will become visible as a folder if you’re browsing your bucket with a GUI like the Management Console, for example. But technically, the key of the object is still folder/object.png.
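To make the "folders are just key prefixes" point concrete, here is a small pure-Python sketch of how a bucket browser could derive a folder view from flat keys. The function `list_folder` and the sample keys are hypothetical; it imitates the idea of listing by prefix and delimiter and does not call S3.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Return (folders, objects) that sit directly under the given prefix."""
    folders, objects = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is shown as a "folder".
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(folders), sorted(objects)


# Hypothetical bucket content: a flat list of keys, no real directories.
keys = ["folder/object.png", "folder/other.png", "readme.txt"]
print(list_folder(keys))             # (['folder/'], ['readme.txt'])
print(list_folder(keys, "folder/"))  # ([], ['folder/object.png', 'folder/other.png'])
```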

EC2

EC2 Instance Store

Elastic IPs

Elastic Block Store

Manual setup of EBS on an EC2 instance

On an EC2 instance you can see the attached EBS volumes using sudo fdisk -l. Usually, EBS volumes can be found somewhere in the range of /dev/xvdf to /dev/xvdp. The root volume (/dev/xvda) is an exception: it's based on the AMI you choose when you launch the EC2 instance, and contains everything needed to boot the instance (your OS files).

Elastic File System

Use-cases for EFS

Elastic Load Balancer (ELB)

Simple Queue Service (SQS)

Elastic Beanstalk

Features:

Nevertheless, it still gives you a virtual machine you can log in to for debugging.

Relational Database Service: RDS

DynamoDB

["michael", 1] => {
  "uid": "michael",
  "tid": 1,
  "description": "prepare lunch"
}
["michael", 3] => {
  "uid": "michael",
  "tid": 3,
  "description": "prepare talk for conference"
}
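The listing above maps a pair of values to one item. A plain Python dictionary keyed by a tuple behaves similarly, which may help when reasoning about lookups. This sketch assumes that uid acts as the partition key and tid as the sort key; the helper names `put_item`, `get_item` and `query` are chosen here for illustration and do not talk to DynamoDB.

```python
# In-memory stand-in for a table with a composite primary key (uid, tid).
table = {}


def put_item(item: dict) -> None:
    """Store an item under its (uid, tid) key, replacing any previous item."""
    table[(item["uid"], item["tid"])] = item


def get_item(uid: str, tid: int):
    """Look up exactly one item; both parts of the key are required."""
    return table.get((uid, tid))


def query(uid: str) -> list:
    """Return all items of one user, ordered by tid."""
    matches = [(key, item) for key, item in table.items() if key[0] == uid]
    return [item for _, item in sorted(matches, key=lambda pair: pair[0])]


put_item({"uid": "michael", "tid": 3, "description": "prepare talk for conference"})
put_item({"uid": "michael", "tid": 1, "description": "prepare lunch"})

print(get_item("michael", 1)["description"])  # prepare lunch
print([item["tid"] for item in query("michael")])  # [1, 3]
```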

Security Group

It is possible to control network traffic based on whether the source or destination belongs to a specific security group. For example, you can say that a MySQL database can only be accessed if the traffic comes from your web servers, or that only your proxy servers are allowed to access the web servers. Because of the elastic nature of the cloud, you’ll likely deal with a dynamic number of virtual machines, so rules based on security groups scale better than those based on IP addresses etc.
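As a rough sketch of the MySQL example, the rule below names another security group as the traffic source instead of an IP range. The dictionary follows the `IpPermissions` shape that boto3's `authorize_security_group_ingress` accepts; the group IDs and the helper `ingress_from_group` are placeholders invented for this example.

```python
def ingress_from_group(port: int, source_group_id: str, protocol: str = "tcp") -> dict:
    """Build one inbound rule that allows traffic from members of another security group."""
    return {
        "IpProtocol": protocol,
        "FromPort": port,
        "ToPort": port,
        # The source is a security group, not a CIDR block.
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


WEB_SG = "sg-web-example"  # hypothetical ID of the web servers' group
DB_SG = "sg-db-example"    # hypothetical ID of the database's group

mysql_rule = ingress_from_group(3306, WEB_SG)
print(mysql_rule)

# With boto3 installed and credentials configured, the rule could be applied with:
#   import boto3
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId=DB_SG, IpPermissions=[mysql_rule])
```

New web servers only need to be placed in the web group; the database rule itself never has to change.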

This wasn't mentioned in the book, but the examples gave me the impression that it is more important to limit inbound ports and IPs. Many examples did nothing with outbound.

Jump Box concept

To implement the concept of a bastion host, you must follow these two rules:

It’s important that the bastion host does nothing but SSH, to reduce the chance of it becoming a security problem.

Use ssh -A to enable agent forwarding when you SSH into your jump box.

CloudTrail

CloudWatch

Consists of the following:
- metrics: watches various metrics (network usage, disk usage, number of function invocations)
- alarms: creates alarms when metrics exceed certain thresholds
- logs
- events: whenever something changes in your infrastructure, an event is generated in near real-time. For example, CloudTrail emits an event for every call to the AWS API. AWS emits an event to notify you of service degradations or downtimes.

Typical alarm: You might set up an alarm to trigger if the 10-minute average of the CPUUtilization metric is higher than 80% for 1 out of 1 data points, and if the 10-minute average of the SwapUsage metric is higher than 67108864 bytes (64 MB) for 1 out of 1 data points.
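The CPU half of that alarm could be written down as parameters for boto3's `put_metric_alarm` call. This is a sketch under assumptions: the alarm name, the instance ID and the helper `cpu_alarm_params` are made up, and the `AWS/EC2` namespace assumes the metric belongs to an EC2 instance. The block only builds the parameters, so it runs without AWS access.

```python
# 64 MB expressed in bytes, matching the SwapUsage threshold quoted above.
SWAP_THRESHOLD_BYTES = 64 * 1024 * 1024


def cpu_alarm_params(instance_id: str) -> dict:
    """Parameters for an alarm on the 10-minute average of CPUUtilization."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 600,            # seconds, i.e. a 10-minute window
        "EvaluationPeriods": 1,   # look at 1 data point ...
        "DatapointsToAlarm": 1,   # ... and alarm if 1 of them breaches
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanThreshold",
    }


params = cpu_alarm_params("i-0123456789abcdef0")  # hypothetical instance ID
print(params["AlarmName"], SWAP_THRESHOLD_BYTES)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```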

From queueing theory, utilization over about 80% is problematic because wait time rises steeply and non-linearly with the utilization of a resource. This applies to CPUs, hard disks, and cashiers at a help desk alike. It occurs because requests for the resource do not arrive at convenient, even intervals, i.e. they are bursty. When you go from 0% utilization to 60%, wait time roughly doubles. When you go to 80%, wait time has tripled. When you go to 90%, wait time is about six times higher. And so on. So if your wait time is 100 ms at 0% utilization, you already have a 300 ms wait time at 80% utilization, which is already slow for an e-commerce web site.
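The multipliers quoted above are close to what a simple single-server queue with random arrivals and a fixed service time (M/D/1) predicts. The choice of that model is an assumption made here for illustration; the text does not say which model its numbers come from. Under that assumption, the time a request spends in the system, relative to an idle resource, is 1 + ρ / (2(1 - ρ)) for utilization ρ.

```python
def relative_response_time(utilization: float) -> float:
    """Time in system relative to an idle resource, assuming an M/D/1 queue."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    return 1 + utilization / (2 * (1 - utilization))


for rho in (0.0, 0.6, 0.8, 0.9):
    print(f"{rho:.0%}: {relative_response_time(rho):.2f}x")
# 0%: 1.00x
# 60%: 1.75x
# 80%: 3.00x
# 90%: 5.50x
```

The curve is flat at low utilization and climbs quickly past 80%, which is why alarms are commonly placed around that value.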

Amazon API Gateway

Lambda

ElastiCache

IAM

Typical policy:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "1",
    "Effect": "Allow",
    "Action": "ec2:*",
    "Resource": "*"
  }]
}

This allows every action for the EC2 service, for all EC2 resources you have.

If you have multiple statements that apply to the same action, Deny overrides Allow. The following policy allows all EC2 actions except terminating EC2 instances:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "1",
    "Effect": "Allow",
    "Action": "ec2:*",
    "Resource": "*"
  }, 
  {
    "Sid": "2",
    "Effect": "Deny",
    "Action": "ec2:TerminateInstances", 
    "Resource": "*"
  }]
}
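The "Deny overrides Allow" rule can be illustrated with a toy evaluator. This is a heavily simplified sketch, not IAM's real evaluation logic: it assumes each statement has a single Action and Resource string, uses shell-style wildcard matching from Python's `fnmatch` module, and ignores conditions, principals and other policy types. The function name `is_allowed` is made up.

```python
from fnmatch import fnmatchcase

POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "1", "Effect": "Allow", "Action": "ec2:*", "Resource": "*"},
        {"Sid": "2", "Effect": "Deny", "Action": "ec2:TerminateInstances", "Resource": "*"},
    ],
}


def is_allowed(policy: dict, action: str, resource: str = "*") -> bool:
    """Return True only if some statement allows the request and none denies it."""
    allowed = False
    for statement in policy["Statement"]:
        matches = (fnmatchcase(action, statement["Action"])
                   and fnmatchcase(resource, statement["Resource"]))
        if not matches:
            continue
        if statement["Effect"] == "Deny":
            return False  # an explicit Deny always wins
        allowed = True
    return allowed  # stays False when nothing matched (implicit deny)


print(is_allowed(POLICY, "ec2:RunInstances"))        # True
print(is_allowed(POLICY, "ec2:TerminateInstances"))  # False
print(is_allowed(POLICY, "s3:GetObject"))            # False
```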

So far "*" has meant every resource. But we can be more specific with an Amazon Resource Name (ARN):

arn:aws:ec2:us-east-1:878533e33333:instance/i-3dd4f812

How to read this:
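An ARN is a colon-separated string of the form arn:partition:service:region:account-id:resource. A minimal Python sketch that splits the example into its parts follows; the `Arn` tuple and `parse_arn` helper are names invented for this illustration.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str   # e.g. "aws"
    service: str     # e.g. "ec2"
    region: str      # e.g. "us-east-1"
    account_id: str  # the AWS account that owns the resource
    resource: str    # service-specific part, here "<type>/<id>"


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its components; the resource part may itself contain colons."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])


arn = parse_arn("arn:aws:ec2:us-east-1:878533e33333:instance/i-3dd4f812")
print(arn.service, arn.region, arn.resource)  # ec2 us-east-1 instance/i-3dd4f812
```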

WARNING: You should never copy a user's access keys to an EC2 instance; use IAM roles instead.

There are various use cases where an EC2 instance (or a Lambda function, etc.) needs to access or manage other AWS resources.

For example, an EC2 instance might need to:

To be able to access the AWS API, an EC2 instance needs to authenticate itself. You could create an IAM user with access keys and store the access keys on an EC2 instance for authentication. But doing so is a hassle, especially if you want to rotate the access keys regularly. Instead of using an IAM user for authentication, you should use an IAM role whenever you need to authenticate AWS resources like EC2 instances. When using an IAM role, your access keys are injected into your EC2 instance automatically.

If an IAM role is attached to an EC2 instance, all policies attached to that role are evaluated to determine whether the request is allowed.

Security generally

VPC

How to debug networking issues due to security with VPC Flow Logs

Say your EC2 instance does not accept SSH traffic as you want it to, but you can’t spot any misconfiguration in your firewall rules. In this case, you should enable VPC Flow Logs to get access to aggregated log messages containing rejected connections.
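Once flow logs are flowing, rejected SSH attempts can be picked out with a short script. The sketch below assumes the default space-separated record format (version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status). The two sample records are made up for illustration and are not real log output.

```python
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line: str) -> dict:
    """Turn one default-format flow log line into a dict of strings."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"unexpected number of fields: {len(values)}")
    return dict(zip(FIELDS, values))


def rejected_ssh(lines) -> list:
    """Keep only records for rejected traffic to port 22."""
    records = (parse_record(line) for line in lines)
    return [r for r in records if r["action"] == "REJECT" and r["dstport"] == "22"]


# Fabricated sample records (documentation IP ranges, placeholder IDs).
sample = [
    "2 123456789012 eni-0example 203.0.113.7 10.0.0.5 49152 22 6 3 180 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0example 203.0.113.8 10.0.0.5 49153 443 6 10 840 1700000000 1700000060 ACCEPT OK",
]
for record in rejected_ssh(sample):
    print(record["srcaddr"], "->", record["dstaddr"], record["action"])
# 203.0.113.7 -> 10.0.0.5 REJECT
```

A rejected record for port 22 from your own IP address suggests that a security group or network ACL rule is blocking the connection.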

Options for deploying

CLI

Workflow (potentially break these out into sub-tips later)

Resources