In this series we've used AWS services to build our cloud infrastructure. This is a list of the services we used, with some information about each one. I'll also mention some features that we didn't touch, but that could be interesting for improving our environment.
CloudWatch is the monitoring service for AWS resources. It collects data about your services and applications and acts according to rules you specify. We used it to scan and monitor log files and send an alarm or notification.
However, it can also actively fix a problem. For example, it can trigger an auto-scaling group to scale out or in depending on CPU utilization, or restart an unhealthy instance. It does this with rules, which you can also set up on a schedule, like cron jobs.
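To make the CPU-based scaling rule a bit more concrete, here is a minimal sketch of the parameters such a CloudWatch alarm might take. The alarm name, auto-scaling group name and policy ARN are made-up placeholders, and the threshold and periods are just assumptions, not values we actually use.

```python
def build_cpu_alarm(alarm_name, asg_name, scale_out_policy_arn, threshold=80.0):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        # Watch the average CPU of all instances in one auto-scaling group.
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Statistic": "Average",
        "Period": 300,            # seconds per data point (assumed value)
        "EvaluationPeriods": 2,   # two periods in a row must breach the threshold
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # The action is a scaling policy; the ARN passed in below is a placeholder.
        "AlarmActions": [scale_out_policy_arn],
    }


params = build_cpu_alarm(
    "web-cpu-high",  # hypothetical alarm name
    "web-asg",       # hypothetical auto-scaling group
    "arn:aws:autoscaling:eu-west-1:123456789012:scalingPolicy:EXAMPLE",
)

# With boto3 installed and credentials configured, this would be sent with:
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

The sketch only builds the request, so it runs without an AWS account. A matching "scale in" alarm would use a lower threshold and `LessThanThreshold`.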
EC2 is the AWS "virtual server" solution. With EC2 you create instances that run an OS on which you run your applications. You can easily change the instance type if you need more or fewer computing resources. EC2 is also used to create instances that run your containers through ECS (more below).
One of the things that makes EC2 instances cool is the auto-scaling feature. We haven't used it yet because our application design isn't compatible: we first need to move the databases and configuration files off the instances.
The OS is stored on Elastic Block Store (EBS). EBS is roughly analogous to virtual disks in a virtual server environment. An EBS volume is normally attached to a single instance, so its files can only be read by that instance. If you need to share files between instances, you need EFS.
Route 53 is the service that takes care of your domain names, DNS management and traffic management. We didn't use the last one.
You can register your domains directly with AWS through Route 53, which allows you to set up hosted zones inside AWS. This makes it much easier to add for example email records, as the service can automatically enter them into the hosted zone.
SES is the service to send emails from your applications to the world. It gives you an SMTP endpoint, or you can call it directly through its API. The latter can be done by sending variables to an SES template. We didn't touch that yet.
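Since we haven't tried templates ourselves, take the following as a rough sketch of what "sending variables to an SES template" could look like. The addresses, the template name `InvoiceReady` and the variables are all invented for illustration.

```python
import json


def build_templated_email(sender, recipient, template_name, variables):
    """Build keyword arguments for the SES send_templated_email call."""
    return {
        "Source": sender,
        "Destination": {"ToAddresses": [recipient]},
        "Template": template_name,
        # SES expects the template variables as a JSON string, not a dict.
        "TemplateData": json.dumps(variables),
    }


email = build_templated_email(
    "noreply@example.com",       # hypothetical verified sender
    "customer@example.com",      # hypothetical recipient
    "InvoiceReady",              # hypothetical template name
    {"name": "Alice", "month": "March"},
)

# With boto3 installed and credentials configured, this would be sent with:
#   boto3.client("ses").send_templated_email(**email)
```

The template itself would be created once in SES, with placeholders such as `{{name}}` that are filled in from `TemplateData`.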
SNS sends notifications to email, phone or other applications. It is triggered by CloudWatch or another AWS service. It differs from SES in that it can only send plain text, where SES can send styled HTML emails.
S3 stands for Simple Storage Service. It is an object storage solution with high durability. It is used to store static files that can be downloaded over an HTTP(S) connection.
S3 is not meant for sharing frequently modified files between EC2 instances. If you are running multiple instances that need to share data, use EFS.
Basically, you try to store as much of your data as possible in S3, because it's cheap. If data has to be shared between instances, it goes on EFS. If direct access is important, it goes on an EBS volume.
S3 can be even cheaper if you have archive files. The Glacier class, for example, is ~80% cheaper for long-term storage of files you will probably never need again, but have to keep for compliance or peace of mind.
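Moving files to Glacier doesn't have to be done by hand. As a sketch, a lifecycle rule like the one below could move everything under a prefix to the Glacier class after a number of days. The `archive/` prefix, the 90 days and the bucket name in the comment are assumptions for the example.

```python
def build_archive_lifecycle(prefix="archive/", days_until_glacier=90):
    """Build an S3 lifecycle configuration that transitions objects to Glacier."""
    return {
        "Rules": [
            {
                "ID": "move-archive-to-glacier",  # hypothetical rule name
                "Status": "Enabled",
                # Only objects whose key starts with this prefix are affected.
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days_until_glacier, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


lifecycle = build_archive_lifecycle()

# With boto3 installed and credentials configured, this would be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=lifecycle)
```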
WorkMail is an email application to send and receive emails, equivalent to Outlook, Exchange or Gmail.
We currently build our own API in ASP.NET and run it on an EC2 instance. Maybe API Gateway can offload that work. We already use it as a gateway to DynamoDB, where we store log data from our app.
CloudFront is the AWS CDN service. It caches your website/application files on edge locations near your customers. This reduces latency and makes your website load faster. You also get free SSL certificates with CloudFront, issued through AWS Certificate Manager.
DynamoDB is the NoSQL solution from AWS. I know very little about databases, so I'm not the person to explain when you would use a SQL or a NoSQL database.
If you have containerized your applications, you should check out ECS. It basically manages your containers, so you don't have to do that. You do have to manage the EC2 instances ECS uses to run the containers on.
If even that is too much, you can use Fargate. With that service you don't even have to manage the EC2 instances.
We're seriously considering moving all our new applications to containers from the start and slowly refactoring the existing applications too.
EFS is a shared NFS file system. If you need to share a file system between instances or containers, this is the one to explore.
When using the auto-scaling function for EC2 instances or containers in ECS, you probably want to balance traffic or users between them. That is done with a load balancer.
The ALB (Application Load Balancer) type has another cool feature: it can route traffic based on the URL, using host-based routing and path-based routing. The former routes traffic based on the (sub)domain in the request. The latter routes traffic based on the path, i.e. the folder part of the URL.
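To illustrate the idea, here is a small model of how host-based and path-based rules pick a target. This is not the ALB API, just plain Python that mimics "first matching rule wins, otherwise use the default". The hostnames and target names are made up.

```python
def pick_target_group(host, path, rules, default):
    """Return the target of the first rule matching the host and/or path."""
    for rule in rules:
        # A rule without a "host" or "path_prefix" key matches any value.
        host_ok = rule.get("host") is None or host == rule["host"]
        path_ok = rule.get("path_prefix") is None or path.startswith(rule["path_prefix"])
        if host_ok and path_ok:
            return rule["target"]
    return default


# Hypothetical rules, checked in order like listener rule priorities.
RULES = [
    {"host": "api.example.com", "target": "api-servers"},     # host-based
    {"path_prefix": "/images/", "target": "image-servers"},   # path-based
]

api_target = pick_target_group("api.example.com", "/v1/users", RULES, "web-servers")
image_target = pick_target_group("www.example.com", "/images/logo.png", RULES, "web-servers")
other_target = pick_target_group("www.example.com", "/", RULES, "web-servers")
```

With these rules, requests to `api.example.com` go to `api-servers`, anything under `/images/` goes to `image-servers`, and everything else falls through to the default `web-servers`.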
If you use a lot of self-contained functions, you should definitely check out Lambda. It's a "serverless" compute service, meaning you don't have to manage any underlying infrastructure like EC2 instances. You select a runtime (Java, Go, PowerShell, Node.js, C#, Python or Ruby) and upload your code. You then trigger the code with an external event, e.g. an SNS message, a changed file in S3 or an updated table in DynamoDB.
Some examples I'm thinking of are generating PDFs or emails based on a parameter file stored in S3, a DynamoDB table or an SNS message sent from the application. Or a user-triggered function like resizing, cropping and compressing images.
The main gain here is that these calculations are not done on your instances, freeing up CPU time for your core applications. The alternative would be to scale up your EC2 environment, which adds costs all the time. With Lambda you only pay for the time the function runs, not for the time it's waiting for a user to actually run it.
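As a minimal sketch of the "file changed in S3" trigger, a Python Lambda function receives the event as a dictionary and can read the bucket and key from it. The bucket and file names in the sample event are invented, and the actual work (resizing, generating a PDF) is left out.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Lambda entry point: collect the bucket and key of every S3 object in the event."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (a space becomes "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
        # The real work (e.g. resizing an image) would happen here.
    return {"processed": uploaded}


# A trimmed-down sample of the event S3 sends; names are hypothetical.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-uploads"},
                "object": {"key": "photos/summer+trip.jpg"},
            }
        }
    ]
}

result = handler(sample_event, None)
```

Calling the handler locally with a sample event like this is a cheap way to test the function before uploading it.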
Currently we run our databases on the EC2 compute instances next to the applications. Although this gives us full control, we also have to manage them. Updates, backups and performance are our responsibility.
With RDS you get a managed database instance (EC2 under the hood). AWS takes care of engine updates, backups and instance tuning. We didn't integrate an RDS instance because the Microsoft SQL Server version is expensive.
AWS offers 6 SQL engines: Aurora, PostgreSQL, MySQL, MariaDB, Oracle and Microsoft SQL Server. Choosing MySQL over SQL Server Express can save you 20%. If you reserve the instances, the difference is negligible. However, if you need more than the Express edition, the difference becomes substantial.
With Aurora, AWS built their own database engine that is optimized for cloud usage. It is compatible with MySQL and PostgreSQL databases. It can also be run serverless, which makes it possible to scale your database instance in and out depending on the current needs.
We have built a queue in our application. This queue holds things like emails to be sent, PDFs to be made and payment requests to be sent. We run cron jobs to regularly check if something is in the queue and process it.
With SQS, we could offload the user requests that now end up in our own queue. By storing them in SQS instead, we can use the queue to trigger a Lambda function.
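A sketch of what the Lambda side of that could look like: SQS hands the function a batch of messages, and each message body would hold one job from our queue. The job format (`type` plus some data) and the `process_job` function are invented for the example. Returning `batchItemFailures` only has an effect if partial batch responses are enabled on the trigger.

```python
import json

KNOWN_JOB_TYPES = {"email", "pdf", "payment"}


def process_job(job):
    """Hypothetical stand-in for the real work (send email, make PDF, ...)."""
    if job.get("type") not in KNOWN_JOB_TYPES:
        raise ValueError(f"unknown job type: {job.get('type')}")
    return f"handled {job['type']}"


def handler(event, context):
    """Lambda entry point: process each SQS message, report the ones that failed."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_job(json.loads(record["body"]))
        except ValueError:
            # Also covers invalid JSON: json.JSONDecodeError is a ValueError.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# Trimmed-down sample of an SQS batch; ids and bodies are made up.
sample_batch = {
    "Records": [
        {"messageId": "1", "body": json.dumps({"type": "pdf", "invoice": 42})},
        {"messageId": "2", "body": json.dumps({"type": "fax"})},
    ]
}

outcome = handler(sample_batch, None)
```

Compared to our cron jobs, nothing has to poll here: the function only runs when there is something in the queue, and failed messages are retried by SQS.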
I'm not sure how I would use Step Functions, but they seem perfect for creating a workflow between different services and multiple Lambda functions. I'd imagine something like this:
A user wants to download this month's data and sends a request by clicking a button. This:
- Triggers the step function when data is placed in S3/SQS/DynamoDB by our application.
- Lambda 1 checks if all files are available.
- If not, start Lambda 2 to generate PDFs from the queue.
- When Lambda 2 finishes, rerun Lambda 1.
- If all files are available, run Lambda 3 to compress all PDFs into a single file.
- When Lambda 3 finishes, start Lambda 4 to generate an email with SES and send the file to the customer.
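The workflow above could be written down as a state machine definition. The following is only my sketch of it in Amazon States Language, built as a Python dictionary. The function names, account number and region are placeholders, and I'm assuming Lambda 1 returns a field called `allAvailable` in its output.

```python
import json


def lambda_arn(name):
    """Build a Lambda ARN; the region and account id are placeholders."""
    return f"arn:aws:lambda:eu-west-1:123456789012:function:{name}"


definition = {
    "Comment": "Collect this month's PDFs and email them to the customer",
    "StartAt": "CheckFiles",
    "States": {
        # Lambda 1: check if all files are available.
        "CheckFiles": {"Type": "Task", "Resource": lambda_arn("check-files"),
                       "Next": "AllFilesAvailable"},
        # Branch on the (assumed) boolean "allAvailable" returned by Lambda 1.
        "AllFilesAvailable": {
            "Type": "Choice",
            "Choices": [{"Variable": "$.allAvailable", "BooleanEquals": True,
                         "Next": "CompressPdfs"}],
            "Default": "GeneratePdfs",
        },
        # Lambda 2: generate the missing PDFs, then check again.
        "GeneratePdfs": {"Type": "Task", "Resource": lambda_arn("generate-pdfs"),
                         "Next": "CheckFiles"},
        # Lambda 3: compress all PDFs into a single file.
        "CompressPdfs": {"Type": "Task", "Resource": lambda_arn("compress-pdfs"),
                         "Next": "SendEmail"},
        # Lambda 4: send the file to the customer through SES.
        "SendEmail": {"Type": "Task", "Resource": lambda_arn("send-email"),
                      "End": True},
    },
}

# Step Functions takes the definition as a JSON string.
definition_json = json.dumps(definition, indent=2)
```

One thing this sketch doesn't handle: if Lambda 2 keeps failing to produce a file, the loop between `CheckFiles` and `GeneratePdfs` never ends, so a real version would need a retry limit or a timeout.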