How to Use Amazon Macie
03 October 22

Posted by INE

In our lab walkthrough series, we go through selected lab exercises on our INE Platform. Subscribe or sign up for a 7-day, risk-free trial with INE and access this lab and a robust library covering the latest in Cyber Security, Networking, Cloud, and Data Science!

Purpose: When storing data in an S3 bucket, it is critical to avoid storing sensitive data in it. If the bucket allows public access, that data is exposed to attackers. As a result, it is essential to have a service that can detect sensitive data that has leaked into an S3 bucket. Amazon Macie comes in handy here. In this article, we will learn how to use the Amazon Macie service to find sensitive data that has leaked into an S3 bucket.

Technical difficulty:

|   Novice   |   Beginner   |   Competent   |   Proficient   |   Expert

What is Amazon Macie?

Amazon Macie is a fully managed data security and privacy solution that uses machine learning and pattern matching to assist you in discovering, monitoring, and protecting sensitive data in your AWS environment.

Amazon Macie1.png

Macie automates the detection of sensitive data, such as personally identifiable information (PII) and financial data, to give you a better understanding of the data stored in Amazon Simple Storage Service (Amazon S3). Macie also keeps an inventory of your S3 buckets and automatically reviews and monitors them for security and access control. It can also detect and report overly permissive or unencrypted buckets.

What are Amazon Macie Findings?

Amazon Macie generates findings when it detects potential policy violations or issues with the security or privacy of your S3 buckets, or when it discovers sensitive data in S3 objects. A finding is a detailed report on a potential issue or on sensitive data that Macie discovered. Each finding includes a severity rating, information about the affected resource, and additional details, such as when and how Macie discovered the issue or data. Macie stores your policy and sensitive data findings for 90 days.

Types of Amazon Macie findings

Amazon Macie generates two categories of findings:

  1. Policy findings

  2. Sensitive data findings

A policy finding is a detailed report of a potential policy violation or a security or privacy issue with an Amazon S3 bucket. Macie generates these findings as part of its ongoing monitoring of your Amazon S3 data.

A sensitive data finding is a detailed report of sensitive data found in an S3 object. Macie generates these findings when it discovers sensitive data in S3 objects that a sensitive data discovery job is configured to analyze. Each category contains several finding types.
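
A finding's type string begins with its category, for example Policy:IAMUser/S3BucketPublic or SensitiveData:S3Object/Personal. As a minimal sketch, the helper below groups finding types by that prefix. The name finding_category is our own and is not part of any AWS SDK.

```python
def finding_category(finding_type: str) -> str:
    """Return the Macie finding category implied by a finding type string."""
    # The category is the text before the first colon in the type string.
    prefix = finding_type.split(":", 1)[0]
    if prefix == "Policy":
        return "policy"
    if prefix == "SensitiveData":
        return "sensitive data"
    return "unknown"


print(finding_category("Policy:IAMUser/S3BucketPublic"))    # policy
print(finding_category("SensitiveData:S3Object/Personal"))  # sensitive data
```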

Lab Workflow

In this lab, we will create an S3 bucket, store sensitive information in it as a JSON file, and then use the Amazon Macie service with a regular expression to generate sensitive data findings.

Now that we have covered all the key terms for the lab, let's carry out the experiment.

Lab Scenario

We have set up the below scenario in our INE labs for our students to practice. The screenshots have been taken from our online lab environment.

Lab Link: Amazon Macie

Objective

Store sensitive data in an S3 bucket and use Amazon Macie to generate sensitive data findings.

Solution

Step 1: Click the lab link button to get access credentials. Log in to the AWS account with these credentials.

Amazon Macie3.png


Step 2: Create an S3 bucket and upload sensitive data. Search for S3 in the search bar and navigate to the S3 dashboard.

Amazon Macie4.png


Step 3: Click on the “Create bucket” button.

Amazon Macie5.png


Step 4: Set the bucket name to "student-lab-bucket-" followed by your AWS account ID.

Amazon Macie6.png
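
If you prefer to build the name programmatically, the following sketch appends a 12-digit account ID to the prefix and checks the result against the basic S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit). The function name lab_bucket_name and the account ID 123456789012 are placeholders for illustration.

```python
import re

BUCKET_PREFIX = "student-lab-bucket-"


def lab_bucket_name(account_id: str) -> str:
    """Build the lab bucket name and validate it against basic S3 naming rules."""
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError("an AWS account ID is expected to be 12 digits")
    name = BUCKET_PREFIX + account_id
    # 3-63 characters, lowercase letters/digits/dots/hyphens,
    # beginning and ending with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", name):
        raise ValueError(f"invalid bucket name: {name}")
    return name


print(lab_bucket_name("123456789012"))  # student-lab-bucket-123456789012
```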


Step 5: Enable ACLs and set the object ownership to “Object writer”.

Amazon Macie7.png


Step 6: Uncheck “Block all public access” so that the bucket can be made public.

Amazon Macie8.png

Confirm the action by checking the box that acknowledges the current settings.

Amazon Macie9.png


Click on the “Create bucket” button.

Amazon Macie10.png


The bucket has been created successfully.

Amazon Macie11.png


There are no objects available in the bucket. Upload files by clicking the “Upload” button.

Amazon Macie12.png


Step 7: Create a JSON file named “data.json”.

Command: nano data.json

Amazon Macie13.png

Step 8: Copy and paste the following content into the data.json file.

Code:

[
  {
    "id": 1,
    "jobTitleName": "Developer",
    "firstName": "Romin",
    "lastName": "Irani",
    "preferredFullName": "Romin Irani",
    "employeeCode": "ANC-1790",
    "region": "CA"
  },
  {
    "id": 2,
    "jobTitleName": "Developer",
    "firstName": "Neil",
    "lastName": "Irani",
    "preferredFullName": "Neil Irani",
    "employeeCode": "AEF-2351",
    "region": "CA"
  }
]

This is sample employee information. We treat the employee code as the sensitive information and detect it with Amazon Macie.

Amazon Macie14.png
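
As an alternative to typing the file in nano, the same data.json can be generated with a short script. This is only a sketch using Python's standard library; write_sample is a hypothetical helper name, and the records are the ones shown above.

```python
import json
from pathlib import Path

# The two sample employee records from Step 8.
EMPLOYEES = [
    {"id": 1, "jobTitleName": "Developer", "firstName": "Romin",
     "lastName": "Irani", "preferredFullName": "Romin Irani",
     "employeeCode": "ANC-1790", "region": "CA"},
    {"id": 2, "jobTitleName": "Developer", "firstName": "Neil",
     "lastName": "Irani", "preferredFullName": "Neil Irani",
     "employeeCode": "AEF-2351", "region": "CA"},
]


def write_sample(path) -> Path:
    """Write the sample records to `path` as indented JSON and return the path."""
    path = Path(path)
    path.write_text(json.dumps(EMPLOYEES, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    write_sample("data.json")
```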

Step 9: Choose the “data.json” file to upload. 

Amazon Macie15.png

Click on the “Upload” button.

Amazon Macie16.png


Step 10: Click on “Permissions” inside the created bucket.

Amazon Macie17.png

Step 11: Click on “Edit” in the ACL block.

Amazon Macie18.png

Enable public read access.

Amazon Macie19.png


Confirm the action by checking the box that acknowledges the current settings.

Amazon Macie20.png

Now the bucket is publicly accessible.

Amazon Macie21.png
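
Public read access through an ACL is expressed as a grant to the AllUsers group. The sketch below inspects a list of grants shaped like the Grants list in an S3 bucket ACL response and returns the permissions given to that group. The sample grants are hand-written for illustration and public_permissions is a hypothetical helper name.

```python
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def public_permissions(grants):
    """Return the set of permissions granted to the AllUsers (public) group."""
    return {
        grant.get("Permission")
        for grant in grants
        if grant.get("Grantee", {}).get("Type") == "Group"
        and grant.get("Grantee", {}).get("URI") == ALL_USERS_URI
    }


# Illustrative grants: the owner has full control, everyone can read.
sample_grants = [
    {"Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
     "Permission": "FULL_CONTROL"},
    {"Grantee": {"Type": "Group", "URI": ALL_USERS_URI},
     "Permission": "READ"},
]

print(public_permissions(sample_grants))  # {'READ'}
```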

Step 12: Search for Macie in the search bar and navigate to the Amazon Macie dashboard.

Amazon Macie22.png

Here we will create a custom data identifier where we will set a regular expression that matches the pattern of data present in the S3 bucket.

Click on the “Get started” button.

Amazon Macie23.png

Step 13: Click on the “Enable Macie” button.

Amazon Macie24.png

As soon as Macie is enabled, it automatically discovers all the buckets and the objects stored inside each bucket. The Macie dashboard then appears with a summary based on the number and size of the buckets.

Amazon Macie25.png

Step 14: Click on the “Create job” button.

A sensitive data discovery job is a series of automated processing and analysis tasks that Macie performs to analyze objects in S3 buckets and determine whether the objects contain sensitive data.

Amazon Macie26.png

Step 15: For the Refine the scope step, choose One-time job, and then choose Next.

Amazon Macie27.png

Step 16: Select the created S3 bucket.

Amazon Macie28.png

Click on the “Next” button.

Amazon Macie29.png

Review S3 bucket settings.

Amazon Macie30.png

Click on the “Next” button.

Amazon Macie31.png


Step 17: Click on the arrow to expand the window of Additional settings.

Amazon Macie32.png


Step 18: Leave the object criteria at its default, File name extensions. Enter “json” in the text box and click on the Include button.

Amazon Macie can analyze data in many different formats, including commonly used compression and archive formats.

Amazon Macie33.png


The file extension “json” has been included successfully.

Amazon Macie34.png


Click on the “Next” button.

Amazon Macie35.png


Step 19: Set the selection type to “All”.

Amazon Macie36.png

Click on the “Next” button.

Amazon Macie37.png


Step 20: Create a custom identifier to find the sensitive data in the JSON file. Click on “Manage custom identifier”.

Amazon Macie38.png


A custom data identifier is a set of criteria that you define to detect sensitive data. The criteria consist of a regular expression (regex) that defines a text pattern to match and, optionally, character sequences and a proximity rule that refine the results.
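
To get a feel for how a keyword and a proximity rule refine regex matches, here is a rough Python emulation. It is not Macie's engine: the function name, the "employeeCode" keyword, and the 50-character distance are assumptions for illustration, and the lab itself does not configure keywords.

```python
import re


def matches_near_keyword(text, pattern, keywords, max_distance=50):
    """Return regex matches whose end lies within max_distance characters
    after the end of some keyword occurrence (keywords are case-insensitive)."""
    keyword_ends = [
        m.end()
        for kw in keywords
        for m in re.finditer(re.escape(kw), text, re.IGNORECASE)
    ]
    hits = []
    for m in re.finditer(pattern, text):
        if any(0 <= m.end() - end <= max_distance for end in keyword_ends):
            hits.append(m.group())
    return hits


record = '{"employeeCode": "ANC-1790", "region": "CA"}'
pattern = r"[A-Z]{3}-[0-9]{4}"
print(matches_near_keyword(record, pattern, ["employeeCode"]))                # ['ANC-1790']
print(matches_near_keyword("order ref QWE-5555", pattern, ["employeeCode"]))  # []
```

Without a nearby keyword, the second string is ignored even though it fits the pattern, which is the kind of false positive a proximity rule is meant to reduce.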

Step 21: Click on the “Create” button.

Amazon Macie39.png



Step 22: Set the identifier name to “EmployeeCodeIdentifier”.

Amazon Macie40.png

Step 23: Copy and paste the following regular expression to match the sensitive data in the file.

Regular expression: [A-Z]{3}-[0-9]{4}

This identifier matches data in the format ABC-0123, i.e. three uppercase letters, a dash, and four digits, which is the format of the employee codes in data.json.

Amazon Macie41.png
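
Before submitting a pattern, it can help to try it against the sample data locally. The sketch below uses Python's re module, which is not the same engine Macie uses, so treat it as a sanity check only. Because the sample employee codes are uppercase, the pattern uses the [A-Z] character class; in Python, the lowercase class [a-z] does not match them.

```python
import re

EMPLOYEE_CODE = re.compile(r"[A-Z]{3}-[0-9]{4}")

# A fragment of data.json containing both employee codes.
sample = '"employeeCode": "ANC-1790", "employeeCode": "AEF-2351"'

print(EMPLOYEE_CODE.findall(sample))             # ['ANC-1790', 'AEF-2351']
print(re.findall(r"[a-z]{3}-[0-9]{4}", sample))  # []
```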


Click on “Submit”.

Amazon Macie42.png


Review the settings and click on “Submit” again.

Amazon Macie43.png

The custom identifier has been created successfully.

Amazon Macie44.png

Step 24: Navigate back to the job creation stage and click on the refresh button.

Amazon Macie45.png

Step 25: Now select the created custom identifier.

Amazon Macie46.png

Click on the “Next” button.

Amazon Macie47.png

Leave the allow lists empty. With allow lists in Amazon Macie, you can define specific text and text patterns that you want Macie to ignore when it inspects Amazon S3 objects for sensitive data.

Amazon Macie48.png
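
Conceptually, an allow list removes known-harmless values from what would otherwise be reported. The following sketch shows that idea on a plain list of detected strings. It is an illustration only, not how Macie implements allow lists, and apply_allow_list and the test code TST-0000 are hypothetical.

```python
import re


def apply_allow_list(detections, ignore_text=frozenset(), ignore_regex=None):
    """Drop detections that equal an ignored string or fully match an ignored pattern."""
    kept = []
    for value in detections:
        if value in ignore_text:
            continue
        if ignore_regex is not None and re.fullmatch(ignore_regex, value):
            continue
        kept.append(value)
    return kept


detected = ["ANC-1790", "AEF-2351", "TST-0000"]
print(apply_allow_list(detected, ignore_text={"TST-0000"}))
# ['ANC-1790', 'AEF-2351']
```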

Click on the “Next” button.

Amazon Macie49.png


Step 26: Enter the job name as “DataIdentification”.

Amazon Macie50.png

Now click on the “Next” button.

Amazon Macie51.png


Now click on the “Submit” button.

Amazon Macie52.png

The Macie job has been created successfully.

Amazon Macie53.png
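
The choices made in the console above map onto a job definition. As a sketch, the function below assembles those choices into a request dictionary shaped after the parameters of the boto3 macie2 create_classification_job call. The field names reflect our understanding of that API and should be checked against the current documentation; the account ID and custom identifier ID are placeholders, and the script only builds and prints the request without contacting AWS.

```python
import json
import uuid


def build_job_request(account_id, bucket_name, custom_identifier_id):
    """Assemble a one-time job request that scans only .json objects."""
    return {
        "clientToken": str(uuid.uuid4()),        # idempotency token
        "jobType": "ONE_TIME",                   # Step 15
        "name": "DataIdentification",            # Step 26
        "managedDataIdentifierSelector": "ALL",  # Step 19
        "customDataIdentifierIds": [custom_identifier_id],  # Step 25
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": [bucket_name]}  # Step 16
            ],
            "scoping": {
                "includes": {
                    "and": [{
                        "simpleScopeTerm": {     # Step 18
                            "comparator": "EQ",
                            "key": "OBJECT_EXTENSION",
                            "values": ["json"],
                        }
                    }]
                }
            },
        },
    }


request = build_job_request(
    "123456789012",                     # placeholder account ID
    "student-lab-bucket-123456789012",  # bucket from Step 4
    "example-custom-identifier-id",     # placeholder identifier ID
)
print(json.dumps(request, indent=2))
# With boto3 installed and credentials configured, the request could be sent with:
#   boto3.client("macie2").create_classification_job(**request)
```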

Step 27: Click on “Findings”.

Amazon Macie54.png

If Macie discovers sensitive data in an object, Macie creates a sensitive data finding. A sensitive data finding is a detailed report of sensitive data that Macie found in an object.

Step 28: Select the finding with the type “SensitiveData:S3Object/Personal”.

Amazon Macie55.png

This finding type indicates that the object contains personally identifiable information (such as full names or mailing addresses), personal health information (such as health insurance or medical identification numbers), or a combination of the two. In our case, the sensitive data is the employee code.

Amazon Macie56.png


Step 29: Select the finding and click on “Export (JSON)” under “Actions”.

Amazon Macie57.png

The complete detail of the finding will be available in the JSON.

Amazon Macie58.png
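
Once a finding is exported, a few lines of Python can pull out the fields of interest. The sketch below assumes field names such as resourcesAffected and classificationDetails from the Macie finding schema as we understand it, so verify them against your own export. The sample finding is hand-written for illustration, not captured from the lab, and it uses the SensitiveData:S3Object/CustomIdentifier type that Macie documents for custom data identifier matches.

```python
import json


def summarize_finding(finding):
    """Extract the type, severity, location, and custom detections from a finding."""
    resources = finding.get("resourcesAffected", {})
    result = finding.get("classificationDetails", {}).get("result", {})
    custom = result.get("customDataIdentifiers", {}).get("detections", [])
    return {
        "type": finding.get("type"),
        "severity": finding.get("severity", {}).get("description"),
        "bucket": resources.get("s3Bucket", {}).get("name"),
        "object": resources.get("s3Object", {}).get("key"),
        "custom_detections": {d.get("name"): d.get("count") for d in custom},
    }


# Hand-written, heavily trimmed example; a real export has many more fields.
exported = json.loads("""
{
  "type": "SensitiveData:S3Object/CustomIdentifier",
  "severity": {"description": "Medium"},
  "resourcesAffected": {
    "s3Bucket": {"name": "student-lab-bucket-123456789012"},
    "s3Object": {"key": "data.json"}
  },
  "classificationDetails": {
    "result": {
      "customDataIdentifiers": {
        "detections": [{"name": "EmployeeCodeIdentifier", "count": 2}]
      }
    }
  }
}
""")

print(summarize_finding(exported))
```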


Conclusion

Congratulations! We learned how to store sensitive data in an S3 bucket and use Amazon Macie to generate sensitive data findings.

Try out Amazon Macie hands-on in our lab! Subscribe or sign up for a 7-day, risk-free trial with INE to access this lab and a robust library covering the latest in Cyber Security, Networking, Cloud, and Data Science!

© 2022 INE. All Rights Reserved. All logos, trademarks and registered trademarks are the property of their respective owners.