Control Tower

AWS Control Tower enables you to set up and govern your multi-account AWS environment.

This is a very high level walkthrough of setting up Control Tower.

To keep things clean, I created a brand new account from which to launch Control Tower. This account will become the Control Tower management account and payer account.

You will need an email address for each account that the landing zone creates. For a lab, I found the following trick useful: some email providers, including Gmail, will accept addresses of the format myemailaddress+xxx@gmail.com, where xxx is any string, for example an AWS account name. AWS also accepts this format, so a single mailbox can be used for all of the accounts.

Select the Control Tower service and click “Set up a Landing Zone”.

You are prompted for a home region. The home region should be the region where you do the most admin work and run your workloads. The audit and other buckets are created in the same AWS Region from which you launch AWS Control Tower. By keeping your workloads and logs in the same AWS Region, you reduce the cost that would be associated with moving and retrieving log information across regions.

The landing zone will create two more accounts, Log archive and Audit. You supply a unique email address for each. You then see the following message:

For each of the accounts, you will receive an email and an SSO portal URL in the format:

https://d-123456789.awsapps.com/start

At the end of the process you will see:

I chose “Configure your account factory”. A Service Catalog portfolio will be created. I then used Account Factory to set up an account called “Dev”. This is where I will start creating resources.
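Once the landing zone and the Dev account exist, you can list all of the accounts from the management account with the CLI. This is just a quick check, assuming you have credentials for the management account configured:

aws organizations list-accounts --query 'Accounts[].{Name:Name,Email:Email,Status:Status}' --output table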

 

WildRydes Real Time Data Processing

There is an awesome project here to process real-time data streams. It is a great way to learn the basics of the Kinesis family. As always, I summarise the project here but also add my own comments, hints and tips.

It is based on the fictional WildRydes ride sharing organisation, where customers can order a ride on one of our fleet of unicorns.

Each unicorn has been fitted with a sensor which sends back location and health information once per second to our operations centre.

The first step is to create a Kinesis Data Stream. We supply a name and configure the number of shards. In this case, 1 shard will suffice.
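If you prefer the CLI to the console, the stream can be created with something like the following. The stream name wildrydes is just the one used in this walkthrough; substitute your own:

aws kinesis create-stream --stream-name wildrydes --shard-count 1
# wait a few seconds, then confirm the stream is ACTIVE with 1 shard
aws kinesis describe-stream-summary --stream-name wildrydes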

As is the case for many projects which involve some code, the project creates a Cloud9 IDE as an environment to run a supplied producer script to send data to the stream. If you have not used Cloud9 before that’s no problem. Instructions are supplied and the IDE is available in minutes.

Once the producer script is running, a consumer script displays the once per second output from each unicorn.

A supplied dashboard, running somewhere in AWS, can graphically display the movement of the unicorns. We supply the dashboard with a Cognito Identity Pool ID to give it unauthenticated access to the stream.

In Kinesis, you can monitor the stream. After a while the number of incoming records and Bytes should become stable.

It can be tricky to interpret the data. Hover the mouse over the text that says “Incoming Records Limit” and click the “x” to remove it, so that the graph shows only the actual incoming records (as in the screenshot above). Hover the mouse over “IncomingRecords” to see that it is 300 over the last 5-minute interval. This screenshot was taken when 1 unicorn was flying, generating data at 1-second intervals, which makes 300 records in 5 minutes.
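You can sanity-check the same numbers without the console graphs by querying CloudWatch directly. This is only a sketch, assuming the stream is called wildrydes and you are on a Linux box (for example the Cloud9 instance) with GNU date available:

aws cloudwatch get-metric-statistics \
  --namespace AWS/Kinesis \
  --metric-name IncomingRecords \
  --dimensions Name=StreamName,Value=wildrydes \
  --statistics Sum \
  --period 300 \
  --start-time "$(date -u -d '30 minutes ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)"

With one unicorn flying, each 5-minute datapoint should show a Sum of roughly 300.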

A similar graph shows the number of Bytes:

This is about 58 KB over a 5-minute interval (the flat section of the graph above).

Well you could leave the project there, and if you have not seen Kinesis Data Streams before you already have an idea of what it can do.

If you want to continue, you create an Amazon Kinesis Data Analytics application to read from the Amazon Kinesis stream, and calculate the total distance travelled for each Unicorn.

Referring to the architecture diagram above, you specify the input data stream, and a second data stream for the output.

The application automatically discovers the “schema”. We copy and paste some SQL code to aggregate and transform the data to calculate the total distance travelled and send that once per minute to the output stream.

A supplied consumer script shows the results of the second stream:

Notice the time. Subsequent records will appear exactly on the minute.

The next part of the project is to create a Lambda, triggered by the output stream above, which writes the records to DynamoDB as they arrive.
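In the project the trigger is created from the Lambda console, but the CLI equivalent looks roughly like this. The function name, stream name, account ID and region are placeholders for my setup, so adjust them to yours:

aws lambda create-event-source-mapping \
  --function-name WildRydesStreamProcessor \
  --event-source-arn arn:aws:kinesis:eu-west-1:111122223333:stream/wildrydes-summary \
  --starting-position LATEST \
  --batch-size 1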

Here I am using the console based query editor to query for the per minute items for a particular Unicorn and date.
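The same query can be run from the CLI. This sketch assumes the table from the project, with a partition key of Name and a sort key of StatusTime stored as a string that begins with the date; Name is a DynamoDB reserved word, hence the expression attribute name:

aws dynamodb query \
  --table-name UnicornSensorData \
  --key-condition-expression "#n = :name AND begins_with(StatusTime, :day)" \
  --expression-attribute-names '{"#n": "Name"}' \
  --expression-attribute-values '{":name": {"S": "Shadowfax"}, ":day": {"S": "2021-06-01"}}'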

The last part of the project is to demonstrate Firehose, where we want to save the raw sensor data from the initial stream into S3, for later analysis or ad-hoc queries using Athena.

We create a Firehose Delivery Stream, selecting the original source stream containing the per second sensor data and a bucket for the output, and specifying a delivery frequency of 60 seconds. So every 60 seconds, a new file containing the per second data will be created in S3.
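For reference, the CLI equivalent is roughly the sketch below. In the console the IAM roles are created for you; here the delivery stream name, bucket, role ARNs and account ID are all placeholders:

aws firehose create-delivery-stream \
  --delivery-stream-name wildrydes-firehose \
  --delivery-stream-type KinesisStreamAsSource \
  --kinesis-stream-source-configuration KinesisStreamARN=arn:aws:kinesis:eu-west-1:111122223333:stream/wildrydes,RoleARN=arn:aws:iam::111122223333:role/FirehoseReadKinesisRole \
  --extended-s3-destination-configuration '{
    "RoleARN": "arn:aws:iam::111122223333:role/FirehoseWriteS3Role",
    "BucketARN": "arn:aws:s3:::wildrydes-data-bucket",
    "BufferingHints": {"SizeInMBs": 1, "IntervalInSeconds": 60}
  }'

Delivery happens when either buffering hint is reached; at this tiny data volume it is the 60-second interval that triggers.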

To demonstrate using Athena for ad-hoc queries, we create an “external table”, which tells Athena about the format and location of the data in S3.

Now we can do queries from within the Athena console. Here are a few ideas I tried out:

SELECT DISTINCT name FROM wildrydes
SELECT COUNT(DISTINCT name) FROM wildrydes

SELECT * FROM wildrydes WHERE name = 'Shadowfax'

SELECT * FROM wildrydes WHERE healthpoints < 160
SELECT name, statustime FROM wildrydes WHERE healthpoints < 160
SELECT name, statustime, healthpoints FROM wildrydes WHERE healthpoints < 160 ORDER BY statustime DESC
SELECT name, statustime, healthpoints FROM wildrydes WHERE name = 'Shadowfax' AND healthpoints < 160 ORDER BY statustime DESC
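The same queries can also be submitted from the CLI. The database name and results bucket below are placeholders for whatever you configured when creating the table:

aws athena start-query-execution \
  --query-string "SELECT name, statustime, healthpoints FROM wildrydes WHERE healthpoints < 160 ORDER BY statustime DESC" \
  --query-execution-context Database=default \
  --result-configuration OutputLocation=s3://my-athena-results-bucket/
# the command returns a QueryExecutionId; use it to fetch the results
aws athena get-query-results --query-execution-id <QueryExecutionId>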

The project has a clean-up step, as always. You do *not* want to leave the Kinesis resources running! However, I found that the Lambda/S3/DynamoDB parts are, as always, nearly zero cost for this type of project, so if you want to come back to the project quickly you can leave some things in place and just:

  • Delete the Firehose Delivery Stream.
  • Delete the Data Analytics App
  • Delete the 2 Data Streams

Then when coming back to the project, the whole thing can be quickly brought up in a few steps:

  • Recreate the 2 Data Streams.
  • Start Cloud9 and the producer script
  • Recreate the Kinesis Analytics Application
  • Delete and recreate the Lambda trigger
  • Recreate the Firehose Delivery Stream

Note:

Currently, due to the pandemic, the fleet of unicorns is not actually flying, so for the moment all the above incoming data is only simulated.

Sorry about that.

 

 

WildRydes and AWS Amplify

I recently noticed that one of my favorite AWS tutorials has been updated to use AWS Amplify instead of an S3 bucket to host the static part of a serverless web application.

The rest of the application uses the same architecture as it did before; in other words, API Gateway is used to front the dynamic part of the site and invokes a Lambda function which writes to DynamoDB.

The tutorial can be found here:

https://aws.amazon.com/getting-started/hands-on/build-serverless-web-app-lambda-apigateway-s3-dynamodb-cognito/

From the Amplify page:

AWS Amplify is a set of tools and services that can be used together or on their own, to help front-end web and mobile developers build scalable full stack applications, powered by AWS. With Amplify, you can configure app backends and connect your app in minutes, deploy static web apps in a few clicks, and easily manage app content outside the AWS console.

You have the option of using an autogenerated domain name like this:

https://master.d27ic7jm0hwx2o.amplifyapp.com

or you can use a custom domain name in Route 53 which maps onto a supplied CloudFront name like this (Amplify integrates with CloudFront behind the scenes):

wildrydes.thetrainit.com  CNAME (simple routing)  d2uobbjz8282xe.cloudfront.net
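If you prefer the CLI, the record can be created with something like the following; the hosted zone ID is a placeholder for your own zone:

aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "wildrydes.thetrainit.com",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{"Value": "d2uobbjz8282xe.cloudfront.net"}]
      }
    }]
  }'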

It also integrates with CodeCommit/Github, so that every time an update to the code is pushed, Amplify will provision, build, deploy and verify the application.

 

Reserved Instances

The Architecting on AWS course discusses these purchasing options:

  • On demand
  • Reserved
  • Spot
  • Dedicated Instance
  • Dedicated Host

This blog has been running on a t2.nano for about 2 years, using on-demand pricing at about $5 per month.

I thought it was time to purchase a reserved instance.

First I stopped the instance, took a snapshot, and upgraded it to a t3.nano. It is recommended to use the latest generations.
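The same change can be made from the CLI; the instance and volume IDs below are placeholders:

aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
# snapshot the root volume before changing anything
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "Pre-upgrade snapshot"
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --instance-type '{"Value": "t3.nano"}'
aws ec2 start-instances --instance-ids i-0123456789abcdef0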

Then I went to the Reserved Instances menu in the EC2 console:

Now it will cost $2.88 per month. A huge saving!
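The purchase can also be made from the CLI. This is a sketch: list the matching offerings first, then purchase the one you want using its offering ID:

aws ec2 describe-reserved-instances-offerings \
  --instance-type t3.nano \
  --product-description "Linux/UNIX" \
  --offering-class standard \
  --offering-type "All Upfront" \
  --max-results 5
# purchase the chosen offering
aws ec2 purchase-reserved-instances-offering \
  --reserved-instances-offering-id <offering-id> \
  --instance-count 1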

Note: There is now a more flexible replacement for RIs called “Savings Plans”, but at the time of writing this is not covered in the course or Associate Exam.

 

AD Connector

 

AD Connector is designed to give you an easy way to establish a trusted relationship between your Active Directory and AWS. When AD Connector is configured, the trust allows you to:

  • Sign in to AWS applications such as Amazon WorkSpaces, Amazon WorkDocs, and Amazon WorkMail by using your Active Directory credentials.
  • Seamlessly join Windows instances to your Active Directory domain either through the Amazon EC2 launch wizard or programmatically through the EC2 Simple Systems Manager (SSM) API.
  • Provide federated sign-in to the AWS Management Console by mapping Active Directory identities to AWS Identity and Access Management (IAM) roles.

In this walkthrough, I demonstrate the use case of seamlessly joining Windows instances to your on-premises AD.

To keep it simple, rather than use an on-premises AD, this will be simulated using an EC2 instance; in real life, you would need a VPN or Direct Connect. I used a Windows Server 2012 R2 Base AMI with an instance type of t2.medium, choosing the default VPC and the subnet in AZ A. In real life it would have a static IP address, but the test worked fine using the dynamic private address.

For the Security Group, which in real life would be the on-premises firewall, I opened up all inbound traffic, just for the short duration of the test. See the link below for the actual required ports, which need to allow DNS, Kerberos and LDAP from the CIDR ranges of the AD Connector subnets.

Once ready, RDP to the public address of the instance and configure it as a domain controller.

As it is a while since I worked with AD, I followed this article:

https://social.technet.microsoft.com/wiki/contents/articles/12370.windows-server-2012-set-up-your-first-domain-controller-step-by-step.aspx

I used the domain name onprem.com.

Create a service account which will be used by the AD connector. Follow the instructions here:

https://docs.aws.amazon.com/directoryservice/latest/admin-guide/prereq_connector.html

Alternatively, you could use a Domain Admin account when creating the AD connector later, but creating a service account with the minimum necessary privileges is best practice.

Now for the main part of the walkthrough: creating an AD Connector.

In the AWS console, choose Directory Service, Set up a Directory, AD Connector, Small option. Note that it is about $0.05 an hour, but free-trial eligible for a month. I chose the default VPC and the subnets in AZ A and AZ B.

Supply the Directory name onprem.com and Netbios name onprem.

Supply the IP address of the Domain Controller

Supply the Service Account Username and Password that you created earlier.

It takes about 5 minutes to create the AD Connector.
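I used the console, but for reference the CLI equivalent is roughly the following. The VPC ID, subnet IDs, DNS IP and password are placeholders for my environment, and CustomerUserName is the service account created earlier:

aws ds connect-directory \
  --name onprem.com \
  --short-name onprem \
  --password 'ServiceAccountPassword' \
  --size Small \
  --connect-settings '{
    "VpcId": "vpc-0123456789abcdef0",
    "SubnetIds": ["subnet-aaaa1111", "subnet-bbbb2222"],
    "CustomerDnsIps": ["172.31.20.10"],
    "CustomerUserName": "adconnector"
  }'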

Now to test it by joining a new Windows Server to the on-premises domain.

Launch a Windows instance. I used 2012 R2 Base again, launching it into the default VPC, subnet in AZ A.

The interesting bit is on the Configure Instance page, where you choose Domain join directory and see the onprem.com domain name available. Note that it says that for the domain join to succeed, you must select an IAM role that has the AWS managed policies AmazonSSMManagedInstanceCore and AmazonSSMDirectoryServiceAccess attached. You can choose to have it create the role for you, and supply a name. I called it ADConnectorRole.
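If you would rather create the role yourself, a sketch of the CLI steps is below; ADConnectorRole is just the name I chose:

# trust policy allowing EC2 to assume the role
cat > ec2-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{"Effect": "Allow", "Principal": {"Service": "ec2.amazonaws.com"}, "Action": "sts:AssumeRole"}]
}
EOF
aws iam create-role --role-name ADConnectorRole --assume-role-policy-document file://ec2-trust.json
aws iam attach-role-policy --role-name ADConnectorRole --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
aws iam attach-role-policy --role-name ADConnectorRole --policy-arn arn:aws:iam::aws:policy/AmazonSSMDirectoryServiceAccess
# the launch wizard needs an instance profile of the same name
aws iam create-instance-profile --instance-profile-name ADConnectorRole
aws iam add-role-to-instance-profile --instance-profile-name ADConnectorRole --role-name ADConnectorRole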

I tagged the instance “Member Server”

You can now RDP into it using the credentials of a Domain User account. I used onprem\administrator.

To clean up:

Delete the AD Connector Directory

Terminate the member server and the domain controller

 

 

PrivateLink

In the Advanced Architecting course, there is a section on AWS PrivateLink.

The text in the course:

AWS PrivateLink is a highly available, scalable technology that enables you to privately connect your VPC to supported AWS services, services hosted by other AWS accounts (VPC endpoint services), and supported AWS Marketplace partner services.

You do not have to have an Internet gateway, NAT device, public IP address, AWS Direct Connect (DX) connection, or VPN connection to communicate with the service. Traffic between your VPC and the service does not leave the Amazon network. With PrivateLink, endpoints are created directly inside of your VPC using elastic network interfaces and IP addresses in your VPC’s subnets.

To use AWS PrivateLink, create an interface VPC endpoint for a service in your VPC. This creates a network interface in your subnet with a private IP address to serve as an entry point for traffic destined to the service.

This lab will focus on one of the use cases in the slide: “Enables you to privately connect your VPC to supported AWS services”. I will use the service EC2 for the test. In other words, I will access the EC2 APIs using interface endpoints. One way of testing that is to use the CLI command:

aws ec2 describe-instances

As a pre-requisite to the lab, I launched an Amazon Linux 2, t2.micro instance in the default VPC in AZ-A in region eu-west-1, with SSH allowed by the Security Group. I gave it an Admin role and used “aws configure” to configure the default region. There are many ways of achieving the same thing. The goal is simply to be able to issue CLI commands.

To keep the lab as simple as possible, I am using a public subnet. However, the same idea applies to private subnets, where an instance would be able to access the AWS APIs without using an internet gateway or NAT.

To test accessing the APIs without interface endpoints, issue the command:

aws ec2 describe-instances

It should work.

To see the IP address that the command is using to access the APIs, we need to know the names of the AWS service endpoints. They are documented here:

https://docs.aws.amazon.com/general/latest/gr/ec2-service.html

To see the IP address being resolved, issue the command:

dig ec2.eu-west-1.amazonaws.com 

<output omitted> 

;; QUESTION SECTION:
;ec2.eu-west-1.amazonaws.com. IN A

;; ANSWER SECTION:
ec2.eu-west-1.amazonaws.com. 6 IN A 176.32.118.30


Note that the dig returns a public IP. In other words, the EC2 APIs are being accessed over the internet.

Now to create the endpoint:

Choose Services, VPC, Endpoints, Create Endpoint, and select com.amazonaws.eu-west-1.ec2, or the equivalent for your region.

Choose the default VPC and, to keep it simple, select the subnet in AZ A. In real life, select more than one subnet for high availability. An ENI will be created in each subnet that you choose.

Leave “Enable DNS names” selected. For the Security Group, select or create one which allows all traffic in and out. This SG is associated with the interface endpoint and controls access to the ENI.

For Policy, leave it at full access. This controls which user or service can access the endpoint.

Click Create Endpoint

It will be pending for a couple of minutes.
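For reference, the CLI equivalent of the console steps above is roughly this; the VPC, subnet and security group IDs are placeholders:

aws ec2 create-vpc-endpoint \
  --vpc-endpoint-type Interface \
  --vpc-id vpc-0123456789abcdef0 \
  --service-name com.amazonaws.eu-west-1.ec2 \
  --subnet-ids subnet-aaaa1111 \
  --security-group-ids sg-0123456789abcdef0 \
  --private-dns-enabled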

You can look at the details of the endpoint to see the private IP of the created ENI.

From the instance, repeat the ec2 describe-instances command. It should still work as before, but the traffic is now going over the private link.

Repeat the dig command, to see output similar to:

;; QUESTION SECTION:
;ec2.eu-west-1.amazonaws.com. IN A

;; ANSWER SECTION:
ec2.eu-west-1.amazonaws.com. 60 IN A 172.31.45.146

Note that a private address is being returned.

The traffic stays on the AWS network, is more secure, and takes a more optimal path.

An interface endpoint is about $0.01 an hour.

To clean up:

Delete the Interface Endpoint.

File Gateway

As discussed in several AWS courses, Storage Gateway enables your on-premises applications to use AWS cloud storage, for backup, disaster recovery or migration.

In this lab, we will use File Gateway with NFS shares. There will be no requirement for anything on premises, as we will deploy File Gateway as an EC2 instance.

The lab assumes knowledge of how to create a bucket, launch an EC2 instance, and SSH into an instance.

The lab takes about half an hour.

Decide on the region you will use. I used eu-west-1.

Create an S3 bucket to serve as the storage location for the file share.

I called it “filegateway-<my initials>”, taking all the defaults; in other words, just choose Create Bucket.
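Or from the CLI, using the bucket name that appears later in the mount command (substitute your own suffix, as bucket names must be globally unique):

aws s3 mb s3://filegateway-smc --region eu-west-1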

Deploy an Amazon Linux 2 t2.micro instance to be used for the NFS client. This would normally be on premises. I took all the defaults, that is, using the default VPC, and gave it a tag Key:Name, Value:NFS Client. For the security group, allow SSH as we will log into it in order to mount the File Gateway share.

Deploy and configure the file gateway appliance as an EC2 instance as follows. In real life, you would normally deploy it as a virtual appliance on ESXi or Hyper-V.

From the console, choose Service, Storage Gateway and click Get Started if prompted.

Select File Gateway, Next, select EC2 and choose Launch Instance. Select the t2.xlarge instance type. Refer to the documentation for the officially supported instance types. If you choose t2.micro, you will likely get errors later when activating the gateway.

On configure instance details, I took all the defaults in order to use the default VPC.

On the storage screen, select Add New Volume and take all the defaults, except for enabling “Delete on Termination” so that the volume is not left lying around when we clean up. The device name defaulted to /dev/sdb. The size defaulted to 8GB; in real life it is recommended to be a minimum of 150GB. This disk will be used for upload cache storage.

I tagged the instance with a Name of “File Gateway Appliance” in order to keep track of it.

For the security group, add some rules as follows. In real life, please consult the documentation.

  • Ingress port 22 is not needed for this lab, but can be used to interact directly with the file gateway appliance for troubleshooting.
  • Ingress port 80 from your web browser. This is for gateway activation and can be removed after activation.
  • Ingress ports 111 and 2049 from the NFS client. These are for NFS file sharing data transfer. For the allowed source, I used 172.31.0.0/16, because the client is in the default VPC.
  • I left the outbound rules at the default, which allows egress 443 for communication with the Storage Gateway service.

When the instance is ready, note the public IP address.

Return to the Storage Gateway browser tab and click Next.

Select the Public endpoint and choose Next

On the Connect to gateway screen, Paste the public IP address of the file gateway, then choose Connect to Gateway.

On the Activate Gateway screen, give the gateway a name: I chose “File Gateway Appliance”. Choose Activate Gateway.

It takes a few seconds to configure the local disk to use for upload cache storage.

On the Configure local disks screen, you may get a warning that the Recommended minimum disk capacity is not met. It is recommended to be at least 150GB. This disk is used for upload cache.

Choose Configure Logging and Disable logging, Save and Continue.

When the status of the gateway is Running:

Choose Create File Share, configure the name of the bucket created earlier, choose NFS, then Next.

On the Configure how files are stored in Amazon S3 screen, select S3 standard and choose Next.

A role will be created for the gateway to access the bucket.

On the review screen, you may see a warning that the share will accept connections from any NFS client. To limit the share to certain clients, choose Edit next to Allowed clients, and edit the CIDR block. I used 172.31.0.0/16 as the client is in the default VPC. Choose Create file share and wait for it to change to be available. This takes about a minute.
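For reference, the console steps map onto a CLI call roughly like the one below. The gateway ARN, account ID and role name are placeholders; in the console the role is created for you:

aws storagegateway create-nfs-file-share \
  --client-token "$(uuidgen)" \
  --gateway-arn arn:aws:storagegateway:eu-west-1:111122223333:gateway/sgw-ABCD1234 \
  --location-arn arn:aws:s3:::filegateway-smc \
  --role arn:aws:iam::111122223333:role/FileGatewayBucketAccessRole \
  --client-list 172.31.0.0/16 \
  --default-storage-class S3_STANDARD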

Select the share, and you will see the command to mount the file share on Linux, similar to this:

mount -t nfs -o nolock,hard 172.31.42.177:/filegateway-smc [MountPath]

Log in to the NFS client and create a directory to use to sync data with the bucket.

sudo mkdir -p /mnt/nfs/s3

Mount the file gateway file share using the mount command displayed earlier, replacing the mount path with the path above, for example:

sudo mount -t nfs -o nolock,hard 172.31.42.177:/filegateway-smc /mnt/nfs/s3

If this fails, it is likely to do with the security group rules created earlier.

Verify the share has been mounted:

df -h

Create a file, for example with nano file1.txt, and add a line “This is file1.txt”.

Copy it to the mount path:

cp -v file1.txt /mnt/nfs/s3

Verify the file appears in the bucket. For me, it appeared either immediately or within a few seconds.
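You can also check from the CLI:

aws s3 ls s3://filegateway-smc/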

Try copying some more files, editing a file, and deleting a file.

To clean up:

Be careful of the order. I have not tried doing it in a different order, but I would not be surprised if there were problems.

  • Select the Storage Gateway, choose Actions, Delete gateway. It is removed immediately.
  • From the EC2 console, terminate the File Gateway Appliance, as it is not terminated automatically.
  • Terminate the NFS Client instance.
  • Delete the S3 bucket.