Auditing and monitoring S3 access
At Pattern Match we strongly believe in monitoring of every kind. Nowadays, data security is more important than ever, and everyone is worried about data leaks, even the biggest kids on the block ;) So you want to know who or what is accessing the data you keep in S3. Keep in mind that we are not talking about public buckets only; you may want to audit your private buckets as well.
So what options do we have?
- set up S3 server access logging
- set up CloudTrail object-level logging
Let’s dive into details.
S3 server access logging
This is the most basic option. The setup is described here. It is very simple, and you pay only for the logs you store, so it is cheap.
After you set up server access logging, remember that the first logs may appear only after a couple of minutes, or even hours(!).
Pattern Match best practices: we suggest setting up a separate bucket that collects the logs from all other buckets. In that case, we recommend using the prefix option to guarantee proper granularity of the logs. It is good to set the prefix to the name of the monitored bucket itself.
Check the following example: for the bucket test-bucket.pattern-match.com we set up logging to the access-logs.pattern-match.com bucket, with the prefix set to the bucket name, i.e. test-bucket.pattern-match.com/.
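This best practice boils down to a small logging configuration. Below is a minimal sketch using the bucket names from the example above (all names are illustrative); with boto3 you would pass this structure to `put_bucket_logging`:

```python
# Server access logging configuration for the example above: logs from
# test-bucket.pattern-match.com land in access-logs.pattern-match.com under a
# prefix equal to the monitored bucket's name (all names are illustrative).
logging_status = {
    "LoggingEnabled": {
        "TargetBucket": "access-logs.pattern-match.com",
        "TargetPrefix": "test-bucket.pattern-match.com/",
    }
}

# With boto3 this would be applied as:
#   s3.put_bucket_logging(
#       Bucket="test-bucket.pattern-match.com",
#       BucketLoggingStatus=logging_status,
#   )
print(logging_status["LoggingEnabled"]["TargetPrefix"])
```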
Warning: this functionality is not ideal when you need reliable logs. The completeness and timeliness of server logging is not guaranteed. The log record for a particular request might be delivered long after the request was processed, or it might not be delivered at all. More details can be found here.
So what is logged? Let's preview an example log entry:
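The entry below is a made-up sample in the server access log format (the real values for your bucket will obviously differ), plus two lines of Python that pull out the field we care about:

```python
# A made-up S3 server access log entry; real entries follow the same
# space-separated layout, but with your own bucket, requester, and request data.
sample = (
    'exampleowner test-bucket.pattern-match.com [06/Feb/2019:00:00:38 +0000] '
    '192.0.2.3 exampleowner 3E57427F33A59F07 REST.GET.OBJECT photos/cat.jpg '
    '"GET /photos/cat.jpg HTTP/1.1" 200 - 2662992 2662992 70 10 "-" '
    '"curl/7.61.1" - examplehostid SigV4 ECDHE-RSA-AES128-GCM-SHA256 '
    'AuthHeader s3.eu-west-1.amazonaws.com TLSv1.2'
)

# The Host Header is the second-to-last whitespace-separated field
# (the last one is the TLS version used by the request).
host_header = sample.split()[-2]
print(host_header)  # s3.eu-west-1.amazonaws.com
```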
We will not go through the whole log entry here, as it is boring and you can check the details here. However, you should look at the second-to-last field:
s3.eu-west-1.amazonaws.com. This field is called the
Host Header, and its value is the endpoint used to connect to Amazon S3. Why is it important? We will answer this question in the last section of this blog post.
Setting up CloudTrail object-level logging
This is an alternative to the server logs described earlier. The biggest advantages over server logs are:
- reliable delivery within 5-15 minutes
- the ability to turn on logging for only a subset of objects in the bucket via a prefix
- the option to forward logs to CloudWatch Events, so you can set up more sophisticated monitoring
In terms of costs, the CloudTrail-based solution is more expensive, because CloudTrail Data Events are not free. At this moment Amazon charges $0.10 per 100,000 events. On top of that, you also pay for the S3 storage of the produced logs, just as with server logging.
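To get a feel for the difference, here is a back-of-the-envelope estimate based on the price quoted above (the S3 storage of the logs is billed separately):

```python
# Rough CloudTrail Data Events cost, assuming the $0.10 per 100,000 events
# price mentioned above; S3 storage of the produced logs is extra.
def data_events_cost_usd(events: int, price_per_100k: float = 0.10) -> float:
    return events / 100_000 * price_per_100k

# e.g. five million object-level events per month:
print(data_events_cost_usd(5_000_000))  # 5.0
```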
What is logged in CloudTrail Data Events
Setting up CloudTrail Data Events for an S3 bucket is more complex than simple server logging. The whole process is well documented in the AWS documentation.
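For orientation, the setup boils down to attaching an event selector like the one below to an existing trail (the bucket ARN is illustrative; with boto3 you would pass it to `put_event_selectors`):

```python
# Event selector that turns on object-level (data event) logging for a single
# bucket; the ARN below is illustrative. "Values" may also point at a prefix
# inside the bucket to log only a subset of objects.
event_selectors = [
    {
        "ReadWriteType": "All",
        "IncludeManagementEvents": True,
        "DataResources": [
            {
                "Type": "AWS::S3::Object",
                "Values": ["arn:aws:s3:::test-bucket.pattern-match.com/"],
            }
        ],
    }
]

# With boto3:
#   cloudtrail.put_event_selectors(
#       TrailName="my-trail", EventSelectors=event_selectors
#   )
print(event_selectors[0]["DataResources"][0]["Type"])
```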
So, assuming you have already set everything up, let's check what a log entry looks like:
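The record below is a trimmed, made-up example in the shape CloudTrail produces for an object-level read (real records carry many more fields, such as userIdentity, eventTime, and sourceIPAddress):

```python
import json

# A trimmed, made-up CloudTrail data event for a GetObject call;
# all values are illustrative.
record = {
    "eventSource": "s3.amazonaws.com",
    "eventName": "GetObject",
    "requestParameters": {
        "bucketName": "test-bucket.pattern-match.com",
        "key": "photos/cat.jpg",
        "Host": "s3.eu-west-1.amazonaws.com",
    },
}

params = record["requestParameters"]
print(json.dumps(params, indent=2))
```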
The fields are self-explanatory. The part that interests us is the requestParameters snippet: it shows which object from which bucket has been accessed. Another request may look almost identical, except for the value of the Host field. You have probably already spotted the difference. The information in Host is pretty important. Let's check out why.
S3 path deprecation
In May 2019 the Amazon team announced that they want to deprecate the S3 path-style model. In a few words, AWS wants to drop support for URLs of the following form:
https://s3.amazonaws.com/bucket-name/key-name
in favour of:
https://bucket-name.s3.amazonaws.com/key-name
According to the current plan, they will keep supporting already existing buckets and resources. However, all new buckets and resources created after September 30, 2020 must be referenced using the virtual-hosted model (the latter form). This is only partially supported at the moment. For example, when you visit this URL you will get an SSL exception, because the SSL certificate does not support multi-level subdomains.
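As a quick illustration of the two addressing models, here is a hypothetical helper that rewrites a path-style URL into the virtual-hosted style (a sketch only; it ignores edge cases such as Regional endpoints or empty keys):

```python
from urllib.parse import urlparse

# Hypothetical helper: convert a path-style S3 URL (bucket in the path) into
# the virtual-hosted style (bucket in the hostname). A sketch, not a complete
# implementation.
def to_virtual_hosted(url: str) -> str:
    parsed = urlparse(url)
    bucket, _, key = parsed.path.lstrip("/").partition("/")
    return f"{parsed.scheme}://{bucket}.{parsed.netloc}/{key}"

print(to_virtual_hosted("https://s3.amazonaws.com/my-bucket/photos/cat.jpg"))
# https://my-bucket.s3.amazonaws.com/photos/cat.jpg
```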
My personal opinion is that AWS will change the URL for buckets whose names contain dots: they will replace the dots with dashes, so that the wildcard SSL cert
*.s3.amazonaws.com will work in that case.
How does path deprecation affect my project
We hope that after reading this blog post you will be able to answer this question yourself. The most important thing is to check how you, your users, or your scripts access your S3 resources. If you need help with that, let's keep in touch. We can set up monitoring and auditing for you.
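One practical way to start that check is to aggregate the Host Header field of your server access logs (assuming you have enabled them as described earlier). A sketch over made-up log lines:

```python
from collections import Counter

# Count which S3 endpoints show up in server access logs. The Host Header is
# the second-to-last field of each entry; the two log lines below are made up.
def count_endpoints(log_lines):
    return Counter(line.split()[-2] for line in log_lines if line.strip())

logs = [
    '... REST.GET.OBJECT ... AuthHeader s3.eu-west-1.amazonaws.com TLSv1.2',
    '... REST.GET.OBJECT ... AuthHeader '
    'test-bucket.pattern-match.com.s3.eu-west-1.amazonaws.com TLSv1.2',
]
print(count_endpoints(logs))
```

Entries whose Host Header is the bare Regional endpoint (rather than one starting with the bucket name) are the path-style requests that the deprecation affects.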