    How to Use Amazon Macie
    03 October 22

    Posted by INE

    In our lab walkthrough series, we go through selected lab exercises on our INE Platform. Subscribe or sign up for a 7-day, risk-free trial with INE and access this lab and a robust library covering the latest in Cyber Security, Networking, Cloud, and Data Science!

    Purpose: When storing data in an S3 bucket, it is critical to avoid storing sensitive data in it. If the bucket is publicly accessible, that data is exposed to attackers. It is therefore essential to have a service that can detect sensitive data leaking into an S3 bucket, and this is where Amazon Macie comes in handy. In this article, we will learn how to use Amazon Macie to find sensitive data leaks in an S3 bucket.

    Technical difficulty:

    |   Novice   |   Beginner   |   Competent   |   Proficient   |   Expert

    What is Amazon Macie?

    Amazon Macie is a fully managed data security and privacy solution that uses machine learning and pattern matching to assist you in discovering, monitoring, and protecting sensitive data in your AWS environment.

    Amazon Macie1.png

    Macie automates the detection of sensitive data, such as personally identifiable information (PII) and financial data, to give you a better understanding of the data stored in Amazon Simple Storage Service (Amazon S3). Macie also keeps an inventory of your S3 buckets and automatically reviews and monitors them for security and access control. In addition, it can detect and report excessively permissive or unencrypted buckets.

    What are Amazon Macie Findings?

    When Amazon Macie finds potential policy violations or issues with the security or privacy of your S3 buckets, or when it discovers sensitive data in S3 objects, it generates findings. A finding is a detailed report on a potential issue or on sensitive data discovered by Macie. Each finding includes a severity rating, information about the affected resource, and additional details, such as when and how Macie discovered the issue or data. Macie stores your policy and sensitive data findings for 90 days.

    Types of Amazon Macie findings

    Amazon Macie generates two categories of findings:

    1. Policy findings

    2. Sensitive data findings

    A policy finding is a detailed report of a potential policy violation or a security or privacy issue with an Amazon S3 bucket. Macie generates these findings as part of its ongoing monitoring of your Amazon S3 data.

    A sensitive data finding is a comprehensive report on sensitive data found in an S3 object. Macie generates these findings when it discovers sensitive data in S3 objects that a sensitive data discovery job is configured to analyze. Each category contains several finding types.

    Lab Workflow

    In this lab, we will create an S3 bucket, store sensitive information in it as a JSON file, and then use Amazon Macie with a regular expression to generate sensitive data findings.

    Now that we have covered all the key terms for the lab, let's carry out the experiment.

    Lab Scenario

    We have set up the below scenario in our INE labs for our students to practice. The screenshots have been taken from our online lab environment.

    Lab Link: Amazon Macie

    Objective

    Store sensitive data in an S3 bucket and use Amazon Macie to generate sensitive data findings.

    Solution

    Step 1: Click the lab link button to get the access credentials. Log in to the AWS account with these credentials.

    Amazon Macie3.png


    Step 2: Create an S3 bucket and upload sensitive data. Search for S3 in the search bar and navigate to the S3 dashboard.

    Amazon Macie4.png


    Step 3: Click on the “Create bucket” button.

    Amazon Macie5.png


    Step 4: Set the bucket name to "student-lab-bucket-" followed by your account ID.

    Amazon Macie6.png
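
    As a minimal sketch, the naming rule from Step 4 can be written down in Python. The helper name `lab_bucket_name` and the account ID `123456789012` are placeholders introduced here for illustration; they are not part of the lab.

```python
import re

BUCKET_PREFIX = "student-lab-bucket-"  # prefix given in Step 4


def lab_bucket_name(account_id: str) -> str:
    """Append the 12-digit AWS account ID to the lab's bucket prefix."""
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError("an AWS account ID is exactly 12 digits")
    return BUCKET_PREFIX + account_id


# Placeholder account ID, used for illustration only.
print(lab_bucket_name("123456789012"))
```

    In a scripted setup, the account ID could be read with the STS `get_caller_identity` call in boto3 instead of being typed in by hand.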


    Step 5: Enable ACLs and set the object ownership to “Object writer”.

    Amazon Macie7.png


    Step 6: Uncheck the “Block all public access” option so that the bucket can be made public.

    Amazon Macie8.png

    Confirm the action by checking the box that acknowledges the current settings.

    Amazon Macie9.png


    Click on the “Create bucket” button.

    Amazon Macie10.png


    The bucket is created successfully.

    Amazon Macie11.png
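
    The console actions in Steps 3 to 6 could also be scripted. The sketch below assumes a boto3 S3 client (for example `boto3.client("s3")`) is passed in as `s3_client`; the function name `create_lab_bucket` and the default region are assumptions made for this example. It mirrors the lab's deliberately insecure settings and is not meant for production buckets.

```python
def create_lab_bucket(s3_client, bucket_name, region="us-east-1"):
    """Create the bucket with ACLs enabled, then lift the public access block."""
    kwargs = {
        "Bucket": bucket_name,
        # "Object writer" in the console corresponds to this ownership setting.
        "ObjectOwnership": "ObjectWriter",
    }
    # S3 expects a location constraint for every region except us-east-1.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3_client.create_bucket(**kwargs)

    # Equivalent of unchecking "Block all public access" (lab use only).
    s3_client.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": False,
            "IgnorePublicAcls": False,
            "BlockPublicPolicy": False,
            "RestrictPublicBuckets": False,
        },
    )
    return kwargs
```

    Passing the client in as an argument keeps the function easy to exercise with a stub before it is pointed at a real account.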


    There are no objects available in the bucket. Upload files by clicking the “Upload” button.

    Amazon Macie12.png


    Step 7: Create a JSON file named “data.json”.

    Command: nano data.json

    Amazon Macie13.png

    Step 8: Copy and paste the following code inside the data.json file.

    Code:

    [
      {
        "id": 1,
        "jobTitleName": "Developer",
        "firstName": "Romin",
        "lastName": "Irani",
        "preferredFullName": "Romin Irani",
        "employeeCode": "ANC-1790",
        "region": "CA"
      },
      {
        "id": 2,
        "jobTitleName": "Developer",
        "firstName": "Neil",
        "lastName": "Irani",
        "preferredFullName": "Neil Irani",
        "employeeCode": "AEF-2351",
        "region": "CA"
      }
    ]

    This is sample employee information. We treat the employee code as the sensitive information and will try to detect it with Amazon Macie.

    Amazon Macie14.png
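
    As an alternative to typing the file in an editor, the same records could be written with a short Python script. This is a sketch only; the function name `write_sample` is introduced here, and the records are copied from the listing above.

```python
import json

# The two sample employee records from Step 8.
EMPLOYEES = [
    {
        "id": 1,
        "jobTitleName": "Developer",
        "firstName": "Romin",
        "lastName": "Irani",
        "preferredFullName": "Romin Irani",
        "employeeCode": "ANC-1790",
        "region": "CA",
    },
    {
        "id": 2,
        "jobTitleName": "Developer",
        "firstName": "Neil",
        "lastName": "Irani",
        "preferredFullName": "Neil Irani",
        "employeeCode": "AEF-2351",
        "region": "CA",
    },
]


def write_sample(path="data.json"):
    """Write the sample records to disk as indented JSON and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(EMPLOYEES, handle, indent=2)
    return path
```

    Calling `write_sample()` with no arguments creates `data.json` in the current directory, ready for the upload in the next step.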

    Step 9: Choose the “data.json” file to upload. 

    Amazon Macie15.png

    Click on the “Upload” button.

    Amazon Macie16.png
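
    The upload can be scripted as well. The sketch below assumes a boto3 S3 client is passed in as `s3_client`; the function name `upload_sample` is hypothetical. It uploads the file and then lists the bucket to confirm that the object is there.

```python
import os


def upload_sample(s3_client, bucket_name, path="data.json"):
    """Upload the file under its own name and return the matching keys found."""
    key = os.path.basename(path)
    s3_client.upload_file(path, bucket_name, key)
    # List objects with the same prefix to confirm the upload landed.
    listing = s3_client.list_objects_v2(Bucket=bucket_name, Prefix=key)
    return [obj["Key"] for obj in listing.get("Contents", [])]
```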


    Step 10: Click on “Permissions” inside the created bucket.

    Amazon Macie17.png

    Step 11: Click on “Edit” in the ACL block.

    Amazon Macie18.png

    Enable public read access.

    Amazon Macie19.png


    Confirm the action by checking the box that acknowledges the current settings.

    Amazon Macie20.png

    Now the bucket is publicly accessible.

    Amazon Macie21.png
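
    A rough scripted counterpart of Steps 10 and 11 is sketched below. It assumes a boto3 S3 client is passed in, and it assumes that the canned `public-read` ACL is an acceptable stand-in for the console checkboxes shown in the screenshots. The function names are introduced for this example.

```python
# Group URI that S3 uses to represent "Everyone (public access)".
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def make_bucket_public_read(s3_client, bucket_name):
    """Apply the canned public-read ACL, then return the bucket's ACL."""
    s3_client.put_bucket_acl(Bucket=bucket_name, ACL="public-read")
    return s3_client.get_bucket_acl(Bucket=bucket_name)


def is_public_read(acl_response):
    """Return True if the ACL grants READ (or more) to the AllUsers group."""
    for grant in acl_response.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("URI") == ALL_USERS_URI and grant.get("Permission") in (
            "READ",
            "FULL_CONTROL",
        ):
            return True
    return False
```

    `is_public_read` works on the dictionary returned by `get_bucket_acl`, so it can also be used to audit other buckets for the same exposure.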

    Step 12: Search for Macie in the search bar and navigate to the Amazon Macie dashboard.

    Amazon Macie22.png

    Here we will create a custom data identifier where we will set a regular expression that matches the pattern of data present in the S3 bucket.

    Click on the “Get started” button.

    Amazon Macie23.png

    Step 13: Click on the “Enable Macie” button.

    Amazon Macie24.png

    As soon as Macie is enabled, it automatically discovers all the buckets and the objects stored inside each bucket. The Macie dashboard then appears, summarizing the buckets by count and storage size.

    Amazon Macie25.png
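
    Enabling Macie and reading its bucket inventory can be sketched with the `macie2` client from boto3. The client is assumed to be passed in (for example `boto3.client("macie2")`), and the function name is hypothetical. The sketch reads only the first page of results, and the inventory may take some time to populate after Macie is first enabled.

```python
def enable_and_list_buckets(macie_client):
    """Enable Macie for the account and return (name, objects, bytes) per bucket."""
    macie_client.enable_macie(status="ENABLED")
    # First page only; larger accounts would follow the nextToken value.
    response = macie_client.describe_buckets()
    return [
        (bucket["bucketName"], bucket.get("objectCount"), bucket.get("sizeInBytes"))
        for bucket in response.get("buckets", [])
    ]
```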

    Step 14: Click on the “Create job” button.

    A sensitive data discovery job is a series of automated processing and analysis tasks that Macie performs to analyze objects in S3 buckets and determine whether the objects contain sensitive data.

    Amazon Macie26.png

    Step 15: In the “Refine the scope” step, choose “One-time job”, and then choose “Next”.

    Amazon Macie27.png

    Step 16: Select the created S3 bucket.

    Amazon Macie28.png

    Click on the “Next” button.

    Amazon Macie29.png

    Review S3 bucket settings.

    Amazon Macie30.png

    Click on the “Next” button.

    Amazon Macie31.png


    Step 17: Click on the arrow to expand the window of Additional settings.

    Amazon Macie32.png


    Step 18: Leave the Object criteria set to the default, “File name extensions”. Enter “json” in the text box and click on the “Include” button.

    Amazon Macie can analyze data in many different formats, including commonly used compression and archive formats.

    Amazon Macie33.png


    The file extension “json” is now included.

    Amazon Macie34.png


    Click on the “Next” button.

    Amazon Macie35.png


    Step 19: Set selection type as “All”.

    Amazon Macie36.png

    Click on the “Next” button.

    Amazon Macie37.png


    Step 20: Create a custom data identifier to find the sensitive data in the JSON file. Click on “Manage custom identifier”.

    Amazon Macie38.png


    A custom data identifier is a set of criteria that you define to detect sensitive data. The criteria consist of a regular expression (regex) that defines a text pattern to match and, optionally, character sequences and a proximity rule that refine the results.

    Step 21: Click on the “Create” button.

    Amazon Macie39.png



    Step 22: Set the identifier name to “EmployeeCodeIdentifier”.

    Amazon Macie40.png

    Step 23: Copy and paste the following regular expression to match the sensitive data in the file.

    Regular expression: [a-z]{3}-[0-9]{4}

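    This identifier matches data in the format abc-0123, that is, three letters, a dash, and then four digits. Note that the sample employee codes use uppercase letters, so a regex engine that matches case-sensitively would need [A-Z]{3}-[0-9]{4} to match them.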

    Amazon Macie41.png
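
    Before submitting a pattern, it can help to check it locally against the sample values. The sketch below uses Python's `re` module, which is not the engine Macie uses, so it only shows how the pattern behaves under ordinary case-sensitive matching. The helper name `matching_codes` is introduced for this example.

```python
import re

# Employee codes from data.json.
CODES = ["ANC-1790", "AEF-2351"]

ORIGINAL = r"[a-z]{3}-[0-9]{4}"   # pattern used in Step 23
UPPERCASE = r"[A-Z]{3}-[0-9]{4}"  # uppercase variant of the same pattern


def matching_codes(pattern, values, flags=0):
    """Return the values that the pattern matches in full."""
    compiled = re.compile(pattern, flags)
    return [value for value in values if compiled.fullmatch(value)]


print(matching_codes(ORIGINAL, CODES))                 # []
print(matching_codes(UPPERCASE, CODES))                # ['ANC-1790', 'AEF-2351']
print(matching_codes(ORIGINAL, CODES, re.IGNORECASE))  # ['ANC-1790', 'AEF-2351']
```

    Under case-sensitive matching, the lowercase character class does not match the uppercase sample codes, which is worth keeping in mind when reviewing the findings later.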


    Click on “Submit”.

    Amazon Macie42.png


    Review the settings and click on “Submit” again.

    Amazon Macie43.png

    The custom identifier is created successfully.

    Amazon Macie44.png

    Step 24: Navigate back to the job creation stage and click on the refresh button.

    Amazon Macie45.png

    Step 25: Now select the created custom identifier.

    Amazon Macie46.png

    Click on the “Next” button.

    Amazon Macie47.png

    Keep the allow lists empty. With allow lists in Amazon Macie, you can define specific text and text patterns that you want Macie to ignore when it inspects Amazon S3 objects for sensitive data.

    Amazon Macie48.png

    Click on the “Next” button.

    Amazon Macie49.png


    Step 26: Enter the job name as “DataIdentification”.

    Amazon Macie50.png

    Now click on the “Next” button.

    Amazon Macie51.png


    Now click on the “Submit” button.

    Amazon Macie52.png

    The Macie job is created successfully.

    Amazon Macie53.png
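
    The job configured in Steps 14 to 26 could be expressed as a single `create_classification_job` request to the `macie2` client in boto3. This is a sketch under assumptions: the function names are hypothetical, the account ID, bucket name, and identifier ID are supplied by the caller, and selecting all managed data identifiers is taken to correspond to the “All” selection in Step 19.

```python
import uuid


def build_job_request(account_id, bucket_name, identifier_id):
    """Build the request for a one-time job limited to .json objects."""
    return {
        "clientToken": str(uuid.uuid4()),  # idempotency token
        "jobType": "ONE_TIME",
        "name": "DataIdentification",
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": [bucket_name]}
            ],
            # Include only objects whose file name extension is "json".
            "scoping": {
                "includes": {
                    "and": [
                        {
                            "simpleScopeTerm": {
                                "comparator": "EQ",
                                "key": "OBJECT_EXTENSION",
                                "values": ["json"],
                            }
                        }
                    ]
                }
            },
        },
        "managedDataIdentifierSelector": "ALL",
        "customDataIdentifierIds": [identifier_id],
    }


def create_job(macie_client, request):
    """Submit the job and return its ID."""
    return macie_client.create_classification_job(**request)["jobId"]
```

    Keeping the request in a plain dictionary makes it easy to review or log the job scope before anything is submitted.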

    Step 27: Click on “Findings”.

    Amazon Macie54.png

    If Macie discovers sensitive data in an object, Macie creates a sensitive data finding. A sensitive data finding is a detailed report of sensitive data that Macie found in an object.

    Step 28: Select the finding with the type “SensitiveData:S3Object/Personal”.

    Amazon Macie55.png

    This finding type indicates that the object contains personally identifiable information (such as full names or mailing addresses), personal health information (such as health insurance or medical identification numbers), or a combination of the two. In our case, the sensitive data of interest is the employee code.

    Amazon Macie56.png
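
    Findings can also be retrieved programmatically. The sketch below assumes a boto3 `macie2` client is passed in and uses a hypothetical function name. It filters for sensitive data findings on one bucket and returns each finding's type and severity, reading only the first page of results.

```python
def fetch_bucket_findings(macie_client, bucket_name):
    """Return (type, severity) for sensitive data findings on one bucket."""
    listing = macie_client.list_findings(
        findingCriteria={
            "criterion": {
                # CLASSIFICATION is the category of sensitive data findings.
                "category": {"eq": ["CLASSIFICATION"]},
                "resourcesAffected.s3Bucket.name": {"eq": [bucket_name]},
            }
        }
    )
    finding_ids = listing.get("findingIds", [])
    if not finding_ids:
        return []  # get_findings needs at least one ID
    details = macie_client.get_findings(findingIds=finding_ids)
    return [
        (finding["type"], finding["severity"]["description"])
        for finding in details["findings"]
    ]
```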


    Step 29: Select the finding and click on “Export (JSON)” under “Actions”.

    Amazon Macie57.png

    The exported JSON contains the complete details of the finding.

    Amazon Macie58.png
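
    Once exported, the JSON can be summarized with a few lines of Python. The sample below is abridged and hand-written for illustration: the field names follow the general shape of a Macie finding, but the values (severity, counts, bucket name) are placeholders rather than output from this lab, and a real export contains many more fields.

```python
import json

# Abridged, hand-written sample; values are placeholders, not lab output.
SAMPLE_EXPORT = """
{
  "type": "SensitiveData:S3Object/Personal",
  "severity": {"description": "High"},
  "resourcesAffected": {
    "s3Bucket": {"name": "student-lab-bucket-123456789012"},
    "s3Object": {"key": "data.json"}
  },
  "classificationDetails": {
    "result": {
      "sensitiveData": [
        {"category": "PERSONAL_INFORMATION",
         "detections": [{"type": "NAME", "count": 2}]}
      ]
    }
  }
}
"""


def summarize_finding(finding):
    """Pull the finding type, severity, affected object, and detections."""
    resources = finding.get("resourcesAffected", {})
    result = finding.get("classificationDetails", {}).get("result", {})
    detections = [
        (group.get("category"), item.get("type"), item.get("count"))
        for group in result.get("sensitiveData", [])
        for item in group.get("detections", [])
    ]
    return {
        "type": finding.get("type"),
        "severity": finding.get("severity", {}).get("description"),
        "bucket": resources.get("s3Bucket", {}).get("name"),
        "object": resources.get("s3Object", {}).get("key"),
        "detections": detections,
    }


print(summarize_finding(json.loads(SAMPLE_EXPORT)))
```

    The lookups use `.get` with defaults so that a finding missing one of these sections still produces a summary instead of raising an error.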


    Conclusion

    Congratulations! We learned how to store sensitive data in an S3 bucket and use Amazon Macie to generate sensitive data findings.

    Try out Amazon Macie hands-on in our lab! Subscribe or sign up for a 7-day, risk-free trial with INE to access this lab and a robust library covering the latest in Cyber Security, Networking, Cloud, and Data Science!

    © 2024 INE. All Rights Reserved. All logos, trademarks and registered trademarks are the property of their respective owners.