In the “Security Engineering on AWS” course there is an overview of Macie.
Amazon Macie is a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS.
As usual for these blogs, I assume you have seen a basic introduction, possibly having attended a course, and want to spend a little time getting hands-on with the service.
At the time of writing, Macie is only available in selected regions, including N. Virginia.
To get started with Macie, create a bucket in one of the supported regions.
Create some content with some dummy PII data. Here are some ideas:
Create a spreadsheet with a name, address, phone number, email address and credit card number. Use only fictitious data.
Create an EC2 keypair. Do not use it to launch an instance. Download the pem file to your local machine.
Create a dummy IAM user with CLI credentials. Do not give the user any permissions. Download the credentials.csv file to your local machine.
Select the Macie service and enable it. It shows you the service role it will use. Enabling it takes a few seconds.
Choose Integrations, then Add, and select your bucket. You will also see a bucket that Macie has created for CloudTrail logs. Macie uses CloudTrail to log S3 data events so that it can analyse activity on your bucket.
Upload the files to the bucket.
It may take some time before any useful data appears, so consider leaving it for about an hour.
Meanwhile, have a look at the Settings menu to see the content types, file extensions, themes and regular expressions that Macie uses to classify data. Note that at the time of writing, there are some limitations. For now it only works with S3, although it may support other data sources in the future, for example EBS, EFS, RDS and DynamoDB.
The classifications are US-centric, for example US format driving licenses are supported but not UK format.
For now you can’t customise things like the regular expressions it uses to classify data.
After leaving it for a while, choose the Alerts menu. In my case, I see an alert about the pem file, with the following description:
“RSA Private Key uploaded to AWS S3. An RSA key is the private encryption key that will be used to protect sensitive information. Please verify that the storage of credential material in this S3 bucket is in compliance with your organization’s policies and that properly locked down access control mechanisms are in place to protect these credentials”
Try to drill down into how it has classified the data, and note that it has used a regex to identify the key.
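To get a feel for what regex-based detection of a private key looks like, here is a minimal sketch in Python. The pattern is my own illustration and is not the expression Macie uses. It simply looks for the PEM header line that appears at the top of a downloaded EC2 pem file. The names PEM_PRIVATE_KEY and looks_like_private_key are hypothetical.

```python
import re

# Hypothetical pattern for illustration only, not Macie's own regex.
# It matches the PEM header of an RSA private key, with "RSA " optional
# so that PKCS#8 style "BEGIN PRIVATE KEY" headers are caught too.
PEM_PRIVATE_KEY = re.compile(r"-----BEGIN (?:RSA )?PRIVATE KEY-----")


def looks_like_private_key(text: str) -> bool:
    """Return True if the text contains a PEM private key header."""
    return PEM_PRIVATE_KEY.search(text) is not None
```

You could call looks_like_private_key on the contents of the pem file you downloaded earlier to check that it contains the header this kind of rule relies on.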
Choose the Research menu and select “s3 Objects” from the drop-down list. It should have identified some of the PII and other secret data in the files you uploaded.
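If you want a rough local idea of what there is to find in your files before comparing with the Research results, a sketch like the following can help. The patterns are simplified assumptions of mine and are far cruder than a real classifier: a basic email shape, any run of exactly 16 digits, and the AKIA prefix followed by 16 upper case letters or digits that IAM user access key IDs use. The names PATTERNS and find_sensitive are hypothetical.

```python
import re

# Simplified, hypothetical patterns for illustration; not Macie's rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_number": re.compile(r"\b\d{16}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_sensitive(text: str) -> dict:
    """Return a dict mapping each pattern label to the matches found in text."""
    hits = {}
    for label, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[label] = found
    return hits
```

For example, passing the text of your dummy spreadsheet or credentials.csv file to find_sensitive returns only the labels that matched, so an empty dict means none of these simple patterns fired.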
Have a look at the pricing and make a decision on whether to disable Macie or leave it enabled. Pricing is based on volume of data and frequency of access.
To clean up, choose Integrations and remove the bucket from the list.
Choose the logged-in username at the top of the screen, then Macie General Settings, and disable Macie.