Amazon S3 Security Part 2: Data Confidentiality

In part one of this five-part series, we discussed how to achieve data archiving in Amazon S3. In this follow-up, we discuss how to achieve data confidentiality. We will walk you through AWS features such as S3 data encryption, Access Analyzer for S3, Amazon Macie, and AWS CloudTrail data events for Amazon S3 as a data source for Amazon GuardDuty. Please note that these features may or may not be used together, depending on your organization’s needs.

Encrypting Amazon S3 Data

Amazon S3 data can be protected while in transit (i.e., traveling to and from Amazon S3) and at rest (i.e., stored on disks in Amazon data centers). Data in transit is secured using Secure Sockets Layer/Transport Layer Security (SSL/TLS) or client-side encryption. Amazon S3 accepts both HTTP and HTTPS requests. Using an Amazon S3 bucket policy, you can enforce HTTPS requests and require object encryption with specific KMS keys, which we will discuss in part five of the series.
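Ahead of that fuller discussion, here is a minimal sketch of what such a bucket policy could look like, built as a plain Python dictionary. The function name, bucket name, and key ARN are hypothetical placeholders; the condition keys (aws:SecureTransport and s3:x-amz-server-side-encryption-aws-kms-key-id) are the standard ones for this purpose. Treat it as a starting point under those assumptions, not a finished policy.

```python
import json


def build_transport_and_kms_policy(bucket_name: str, kms_key_arn: str) -> dict:
    """Build a bucket policy (hypothetical helper) with two Deny statements."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    object_arn = f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject any request that did not arrive over HTTPS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, object_arn],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Reject uploads that do not name the expected KMS key.
                "Sid": "DenyUploadsWithoutExpectedKmsKey",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": object_arn,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption-aws-kms-key-id": kms_key_arn
                    }
                },
            },
        ],
    }


def policy_as_json(bucket_name: str, kms_key_arn: str) -> str:
    """Serialize the policy so it can be pasted into the console or an API call."""
    return json.dumps(build_transport_and_kms_policy(bucket_name, kms_key_arn), indent=2)
```

Note that the second statement also rejects uploads that omit the key header entirely, so clients would need to name the key explicitly on every upload. Whether that is desirable depends on how your applications write to the bucket.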

There are two options for protecting data at rest:

  1. Server-side encryption
  2. Client-side encryption

Server-side encryption allows Amazon S3 to encrypt your objects before saving them on disks in the data center and then decrypt them when you download them. You have three options for this:

  • Server-side encryption with Amazon S3-managed keys (SSE-S3)
  • Server-side encryption with KMS keys stored in AWS Key Management Service (SSE-KMS)
  • Server-side encryption with customer-provided keys (SSE-C)
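To make the difference between the three options concrete, the sketch below maps each one to the encryption-related parameters that an S3 PutObject request would carry. The parameter names (ServerSideEncryption, SSEKMSKeyId, SSECustomerAlgorithm, SSECustomerKey) follow the boto3 put_object call; the helper function itself and its mode strings are our own illustration and are not part of any AWS SDK.

```python
import os
from typing import Optional


def sse_put_object_args(
    mode: str,
    kms_key_id: Optional[str] = None,
    customer_key: Optional[bytes] = None,
) -> dict:
    """Return the extra PutObject parameters for one server-side encryption mode.

    Hypothetical helper: the result is meant to be merged into a call such as
    s3_client.put_object(Bucket=..., Key=..., Body=..., **args).
    """
    if mode == "SSE-S3":
        # Amazon S3 creates and manages the key.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            # Omitting the key ID lets S3 fall back to the AWS managed key.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        # You supply the key on every request; S3 does not store it.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode}")


def new_customer_key() -> bytes:
    """Generate a random 256-bit key for SSE-C (store it safely yourself)."""
    return os.urandom(32)
```

With SSE-C, the same key must be supplied again to download the object, so losing the key means losing access to the data.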

For SSE-KMS, you should consider using the AWS managed key for Amazon S3 (shown in the KMS console as aws/s3) if you’re uploading or accessing objects using AWS Identity and Access Management (IAM) principals in the same account as the KMS key, and/or if you don’t want to manage policies for the KMS key. However, if you have specific compliance or security requirements, you should consider using a customer-managed key, which allows you to create, rotate, and disable the key, define access controls for it, and grant cross-account access to your objects. Cross-account access is possible because the key policy of a customer-managed key can be configured to allow access from another account.
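As a rough illustration of that last point, a key policy statement granting another account use of a customer-managed key might be assembled as follows. The function name and the choice of actions are assumptions for this sketch; adjust the action list to what the other account actually needs (for example, read-only consumers would not need kms:GenerateDataKey).

```python
def cross_account_key_statement(external_account_id: str) -> dict:
    """Build one key policy statement (hypothetical helper) for another account."""
    if not (external_account_id.isdigit() and len(external_account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    return {
        "Sid": "AllowUseOfKeyFromExternalAccount",
        "Effect": "Allow",
        # The account root principal delegates the decision to IAM in that account.
        "Principal": {"AWS": f"arn:aws:iam::{external_account_id}:root"},
        "Action": ["kms:Decrypt", "kms:DescribeKey", "kms:GenerateDataKey"],
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }
```

The key policy alone is not sufficient: principals in the other account still need IAM permissions for these KMS actions, and the bucket policy must allow them to reach the objects.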

On the other hand, client-side encryption allows you to encrypt data yourself using your own encryption keys and then upload the encrypted data to Amazon S3. You can achieve this in two ways:

  • Use a key stored in AWS Key Management Service (AWS KMS)
  • Use a key that you store within your application

Amazon S3 Access Analyzer

Access Analyzer for Amazon S3 is a free feature powered by AWS IAM Access Analyzer that alerts you when Amazon S3 buckets are configured to allow access to anyone on the internet or to other AWS accounts, including AWS accounts outside of your organization. For example, it flags an Amazon S3 bucket that has read or write access provided through a bucket access control list (ACL), a bucket policy, a Multi-Region Access Point policy, or an access point policy. To enable Access Analyzer for Amazon S3, you need to create an account-level analyzer in IAM Access Analyzer for each region from the AWS IAM console.

Once IAM Access Analyzer is enabled, go to the Amazon S3 console and check the “Access Analyzer for Amazon S3” tab. For example, I have a test Amazon S3 bucket that grants read access to another of my AWS accounts (an external account) via its bucket policy.

Additionally, if you want to block all public access to a bucket with a single click, you can do so from within Access Analyzer. Note that blocking public access does not remove cross-account Amazon S3 access that you have explicitly granted to specific accounts. When you apply the block public access settings to an account, they apply to all AWS regions globally. Amazon S3 considers a bucket ACL public if it grants any permissions to members of the predefined “AllUsers” or “AuthenticatedUsers” groups. When evaluating a bucket policy, Amazon S3 starts by assuming that the policy is public and then evaluates it to determine whether it qualifies as non-public. To be considered non-public, a bucket policy must grant access only to fixed values (i.e., values that don’t contain a wildcard or IAM policy variable). For the complete list of the values, please refer to the official AWS documentation.
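The “assume public, then prove non-public” idea can be approximated in a few lines of Python. This is only a simplified heuristic for reasoning about your own policies, not Amazon S3’s actual evaluation logic: the set of condition keys below is a partial, illustrative subset chosen for this sketch, the operator list is our assumption, and special cases (such as very wide IP ranges) are ignored. The official AWS documentation remains the authority.

```python
# Illustrative subset of condition keys (lowercased) that can pin access to fixed values.
FIXED_VALUE_KEYS = {
    "aws:sourcevpc", "aws:sourcevpce", "aws:principalorgid",
    "aws:sourcearn", "aws:sourceaccount", "aws:principalaccount",
}
# Only positive-match operators are treated as restricting access in this sketch.
RESTRICTING_OPERATORS = {"StringEquals", "StringLike", "ArnEquals", "ArnLike"}


def _as_list(value):
    return value if isinstance(value, list) else [value]


def _is_wildcard_principal(principal) -> bool:
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return any("*" in _as_list(v) for v in principal.values())
    return False


def _is_fixed(value) -> bool:
    # A fixed value contains no wildcard and no policy variable such as ${aws:username}.
    text = str(value)
    return "*" not in text and "?" not in text and "${" not in text


def _has_fixed_condition(condition: dict) -> bool:
    for operator, keys in condition.items():
        if operator not in RESTRICTING_OPERATORS:
            continue
        for key, values in keys.items():
            if key.lower() in FIXED_VALUE_KEYS and all(_is_fixed(v) for v in _as_list(values)):
                return True
    return False


def looks_public(policy: dict) -> bool:
    """Start from 'public' and clear a statement only if it is pinned to fixed values."""
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        if not _is_wildcard_principal(statement.get("Principal")):
            continue  # Granted to a specific principal, so not public in this sketch.
        if not _has_fixed_condition(statement.get("Condition", {})):
            return True
    return False
```

For instance, an Allow statement with Principal "*" and no condition would be flagged, while the same statement restricted with StringEquals on aws:SourceVpce to a single endpoint ID would not.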

Now let’s discuss how a security team can act on Access Analyzer findings and operationalize them. Start by reviewing the bucket’s access and changing it if necessary. If the bucket does require access from the public or from other AWS accounts, including accounts outside of your organization, you can archive the finding for the bucket. You can also download the Access Analyzer report as a CSV file for further analysis.

You can also create Access Analyzer archive rules that automatically archive findings matching certain conditions, for example, access granted to another account within your organization. Additionally, you can set up an Amazon EventBridge (formerly Amazon CloudWatch Events) rule that is triggered by IAM Access Analyzer findings. The event rule can be used to send notifications or start remediation actions, for example through an AWS Lambda function that triggers an AWS Config rule.
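As a sketch of that automation, an event pattern for Access Analyzer findings on S3 buckets and a small Lambda-style handler could look like the following. The source and detail-type values are the ones Access Analyzer publishes; the handler, its field selection, and the “urgent”/“review” labels are hypothetical and would be replaced by your own notification or remediation logic.

```python
# Event pattern for an event rule: active Access Analyzer findings on S3 buckets.
ACCESS_ANALYZER_S3_PATTERN = {
    "source": ["aws.access-analyzer"],
    "detail-type": ["Access Analyzer Finding"],
    "detail": {"status": ["ACTIVE"], "resourceType": ["AWS::S3::Bucket"]},
}


def summarize_finding(event: dict) -> dict:
    """Pull out the fields a responder usually needs from a finding event."""
    detail = event.get("detail", {})
    return {
        "bucket_arn": detail.get("resource"),
        "is_public": bool(detail.get("isPublic", False)),
        "principal": detail.get("principal", {}),
        "actions": detail.get("action", []),
    }


def handler(event, context):
    """Lambda-style entry point (hypothetical): label the finding for triage."""
    summary = summarize_finding(event)
    # Public exposure is treated as more urgent than access by a known account.
    summary["severity"] = "urgent" if summary["is_public"] else "review"
    return summary
```

A real handler would typically publish the summary to a notification topic or ticketing system rather than just returning it.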

Amazon Macie

Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in AWS. To enable Amazon Macie for your AWS account, go to the “Amazon Macie” console and click “Enable Macie” on the welcome page. You can disable or suspend Amazon Macie later. Once you enable Macie, you can create various jobs based on your organization’s requirements. For example, we can create a job that scans all objects in an important Amazon S3 bucket that hosts logs and flags any objects that contain personal information.


Alternatively, you can set up a one-time (on-demand) scan for audits to find certain file types or extensions, which is especially useful for compliance requirements. The scan can be run against one or multiple Amazon S3 buckets and can be narrowed with object filters. Amazon Macie will also show you the estimated cost before you run the scan.



Amazon Macie shows all findings in one console.


Amazon Macie has a built-in feature known as “managed data identifiers” that helps you discover various common types of sensitive data. For customers who need more granular customization, Macie offers “custom data identifiers,” which help you find sensitive data specific to your organization, for example proprietary data or intellectual property. We’ll show you a step-by-step walkthrough of how to define and run a custom data identifier to automatically discover specific sensitive data.

First, let’s go to the Amazon Macie console. Choose “custom data identifiers.”

Next, select “create.”

Let’s assume that we have various top-secret files that are classified each year, and that each top-secret file contains a marker that begins with “TS” followed by a hyphen and the four-digit year (YYYY) in which the file was classified, for example, TS-2019, TS-2020, or TS-2022. In this example, we’ll upload such files to the Amazon S3 bucket.
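Before entering a regular expression in the Macie console, it can help to try it locally. The pattern below is our assumption of what would match the “TS-YYYY” markers described above, checked here with Python’s re module. Macie supports only a subset of regular expression syntax, so please confirm in the Macie documentation that each construct you use is supported.

```python
import re

# Assumed pattern: "TS-" followed by exactly four digits, as a standalone token.
TOP_SECRET_PATTERN = re.compile(r"\bTS-[0-9]{4}\b")


def find_markers(text: str) -> list:
    """Return every TS-YYYY marker found in the text, in order of appearance."""
    return TOP_SECRET_PATTERN.findall(text)
```

Calling find_markers on a line such as “Classified under TS-2019, superseded by TS-2022” would return both markers, while strings such as “TS-20” or “TS-20200” would not match because they do not contain exactly four digits.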


Once we fill in the required fields, let’s submit the rule.

Once the rule has been created, let’s try scanning the Amazon S3 bucket where the files containing “TS-YYYY” were deliberately planted. Let’s create a “one-time” job. Alternatively, you can schedule the job to run periodically based on your organization’s needs.

Make sure you select the custom data identifier.


Once the job has been created, wait for the scan to complete.

In my test, the scan flagged the planted files, so the results can be acted on as required. You can also export the findings as JSON for further analysis and record keeping.



AWS CloudTrail data events for Amazon S3 as a data source for GuardDuty

Data events, also known as data plane operations, are often high-volume activities for resource-level operations. For example, the CloudTrail data events for Amazon S3 that GuardDuty can monitor include object-level API operations such as GetObject, PutObject, ListObjects, and DeleteObject. Data event monitoring (S3 protection) is enabled by default for new GuardDuty accounts. Amazon S3 protection enables Amazon GuardDuty to monitor object-level API operations and identify potential security risks for data within your Amazon S3 buckets. GuardDuty monitors threats against your resources by analyzing AWS CloudTrail management events and CloudTrail S3 data events. CloudTrail S3 data event logs are a configurable data source in GuardDuty.
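To give a feel for what an S3 data event looks like, here is a small sketch that filters CloudTrail records down to the object-level operations named above. The field names (eventSource, eventName, requestParameters with bucketName and key) follow the CloudTrail record format; the helper functions are our own illustration, and GuardDuty performs its analysis internally without requiring any such code from you.

```python
# The object-level operations named in this article; the full set is larger.
S3_OBJECT_LEVEL_EVENTS = {"GetObject", "PutObject", "ListObjects", "DeleteObject"}


def is_s3_data_event(record: dict) -> bool:
    """True if the CloudTrail record is one of the S3 object-level calls above."""
    return (
        record.get("eventSource") == "s3.amazonaws.com"
        and record.get("eventName") in S3_OBJECT_LEVEL_EVENTS
    )


def object_accesses(records: list) -> list:
    """Return (eventName, bucket, key) tuples for the S3 data events in a batch."""
    accesses = []
    for record in records:
        if not is_s3_data_event(record):
            continue
        params = record.get("requestParameters") or {}
        # ListObjects has no object key, so key may be None.
        accesses.append((record["eventName"], params.get("bucketName"), params.get("key")))
    return accesses
```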

To enable Amazon S3 protection in GuardDuty, open the GuardDuty console. In the navigation pane, under Settings, choose “S3 Protection” and click “Enable.”


How can a security team action GuardDuty findings?

Findings with a data source of CloudTrail data events for S3 are only generated if you have S3 protection enabled for GuardDuty. If you observe any Amazon S3 bucket-related findings, it is recommended that you thoroughly review the permissions on the bucket as well as the permissions of any users or roles involved in the finding. It’s crucial that you identify the source of any suspicious activity and the API call used, which will be listed as API in the finding details. The source will usually be an IAM principal (an IAM user, role, or account), and identifying details will be listed in the finding. Based on details such as the source type, domain, and remote IP address, the security team can decide whether the source was authorized.
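The details mentioned above can be pulled out of a finding programmatically. The sketch below assumes the finding is shaped like the output of the GuardDuty GetFindings API (PascalCase field names); findings delivered through events or exports use a different casing, so adjust accordingly. The helper names and the trusted network ranges are hypothetical.

```python
import ipaddress


def extract_triage_fields(finding: dict) -> dict:
    """Collect the API call, caller IP, and principal from a GetFindings-style finding."""
    api_call = finding.get("Service", {}).get("Action", {}).get("AwsApiCallAction", {})
    key_details = finding.get("Resource", {}).get("AccessKeyDetails", {})
    return {
        "finding_type": finding.get("Type"),
        "api": api_call.get("Api"),
        "remote_ip": api_call.get("RemoteIpDetails", {}).get("IpAddressV4"),
        "principal_type": key_details.get("UserType"),
        "principal_name": key_details.get("UserName"),
    }


def is_from_trusted_network(remote_ip, trusted_cidrs) -> bool:
    """True if the caller IP falls inside one of your own (assumed) CIDR ranges."""
    if not remote_ip:
        return False
    address = ipaddress.ip_address(remote_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in trusted_cidrs)
```

A caller IP outside your known ranges does not prove compromise on its own, but it is a useful first signal when deciding how deeply to investigate.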

As part of the security team’s investigation, it is crucial to determine whether the call source was authorized to access the identified resource. For example, if an IAM user is involved in the finding, it is possible that the user’s credentials have been compromised. If the access was authorized, you can ignore the finding. The Amazon GuardDuty console also allows you to set up suppression rules so that matching findings no longer appear.

Additionally, if your security team determines that the Amazon S3 data has been exposed or accessed by an unauthorized party, you must review the bucket’s Amazon S3 security settings and restrict access. Please note that the appropriate remediation will depend on your organization and business needs.

Some examples of additional actionable security recommendations:

  1. As mentioned previously in this article, consider using the Amazon S3 block public access settings on buckets, preferably at the AWS account level.
  2. Leverage Amazon S3 bucket policies to restrict access to specific VPC endpoints.
  3. To temporarily grant trusted entities outside your account access to your S3 objects, you can use Amazon S3 presigned URLs. Similarly, if you need to share Amazon S3 objects between different sources, you can use access points to create permission sets that restrict access to only those within your private network.
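As an illustration of the second recommendation, a bucket policy that denies any request not arriving through a specific VPC endpoint might be built like this. The function name, bucket name, and endpoint ID are placeholders; aws:SourceVpce is the standard condition key for this purpose. Consider it a sketch to adapt, not a drop-in policy.

```python
def build_vpce_only_policy(bucket_name: str, vpce_id: str) -> dict:
    """Build a bucket policy (hypothetical helper) limiting access to one VPC endpoint."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # Deny every request whose source VPC endpoint is not the expected one.
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }
```

Be careful when applying a policy like this: it also blocks requests from the AWS console and from any tooling outside the endpoint, so it is easy to lock yourself out of the bucket.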

Summary 

We can use various features to protect data confidentiality in Amazon S3, such as Amazon S3 data encryption, Amazon S3 Access Analyzer, Amazon Macie, and AWS CloudTrail data events for Amazon S3 as a data source for GuardDuty. We walked you through the steps to enable each of these features. In the next part of the series (part three of five), we will discuss how you can achieve data integrity using Amazon S3.

Next steps

Please keep an eye out for the next part of this series. You can also learn more about the technical innovations and best practices powering cloud backup and data management. Visit the Innovation Series section of Druva’s blog archive.

About the author

I have been in the cloud tech world since 2015, wearing multiple hats and working as a consultant to help customers architect their cloud journey. I joined Druva four years ago as a cloud engineer. Currently, I lead Druva’s cloud security initiatives, roadmap, and planning. I love to approach cloud security pragmatically because I strongly believe that the most important component of security is the humans behind the systems. 

Find me on LinkedIn: https://www.linkedin.com/in/aashish-aj/

Email: aashish.aacharya@druva.com