AnalyticsBucket
Amazon S3 Bucket configured for analytics.
Overview
AnalyticsBucket
is an Amazon S3 Bucket configured with the following best-practices and defaults for analytics:
- The bucket name is in the form of
<BUCKET_NAME>-<CDK_ID>-<AWS_ACCOUNT_ID>-<AWS_REGION>-<UNIQUEID>
- Server side bucket encryption managed by KMS customer key. You need to provide a KMS Key
- SSL communication enforcement.
- Access logged to an S3 bucket within a prefix matching the bucket name. By default, store logs in itself.
- All public access blocked.
- Two-step protection for bucket and objects deletion.
AnalyticsBucket
extends the Amazon S3 Bucket
CDK Construct. For custom requirements that are not covered, use the Bucket
construct directly.
Usage
- TypeScript
- Python
class ExampleDefaultAnalyticsBucketStack extends cdk.Stack {
constructor(scope: Construct, id: string) {
super(scope, id);
const key = new Key(this, 'DataKey', {
enableKeyRotation: true
});
new dsf.storage.AnalyticsBucket(this, 'AnalyticsBucket', {
encryptionKey: key
});
}
}
class ExampleDefaultAnalyticsBucketStack(cdk.Stack):
def __init__(self, scope, id):
super().__init__(scope, id)
key = Key(self, "DataKey",
enable_key_rotation=True
)
dsf.storage.AnalyticsBucket(self, "AnalyticsBucket",
encryption_key=key
)
Objects removal
You can specify if the bucket and objects should be deleted when the CDK resource is destroyed using removalPolicy
. Refer to the DataLakeStorage documentation
If no access logs bucket is configure, the AnalyticsBucket
stores its access logs in itself.
This prevents from properly deleting the resources when the removalPolicy
is set to DESTROY
because it creates an edge case where the deletion of the S3 Objects recreates Objects corresponding the access logs. The S3 Bucket is then never empty and the deletion of the Bucket fails.
Bucket Naming
The construct ensures the default bucket name uniqueness which is a pre-requisite to create Amazon S3 buckets.
To achieve this, the construct is creating the default bucket name like accesslogs-<AWS_ACCOUNT_ID>-<AWS_REGION>-<UNIQUEID>
where:
<AWS_ACCOUNT_ID>
and<AWS_REGION>
are the account ID and region where you deploy the construct.<UNIQUEID>
is an 8 characters unique ID calculated based on the CDK path.
If you provide the bucketName
parameter, you need to ensure the name is globaly unique.
Alternatively, you can use the BucketUtils.generateUniqueBucketName()
utility method to create unique names.
This method generates a unique name based on the provided name, the construct ID and the CDK scope:
- The bucket name is suffixed the AWS account ID, the AWS region and an 8 character hash of the CDK path.
- The maximum length for the bucket name is 26 characters.
- TypeScript
- Python
new dsf.storage.AnalyticsBucket(this, 'AnalyticsBucket', {
bucketName: dsf.utils.BucketUtils.generateUniqueBucketName(this, 'AnalyticsBucket', 'my-custom-name'),
encryptionKey: key
});
dsf.storage.AnalyticsBucket(self, "AnalyticsBucket",
bucket_name=dsf.utils.BucketUtils.generate_unique_bucket_name(self, "AnalyticsBucket", "my-custom-name"),
encryption_key=key
)