Basic configuration
For the clickstream project, specify the following basic settings of the data pipeline:
- AWS Region: select the Region in which the pipeline will be created. If some AWS services are not available in the selected Region, the corresponding features are disabled by default. Check the region table for feature availability.
- VPC: specify the VPC where the compute resources of the pipeline will be located. The VPC must meet the following criteria to run the pipeline workload:
  - At least two public subnets across two Availability Zones (AZs).
  - At least two private subnets (with NAT gateways or NAT instances) across two AZs, or at least two isolated subnets across two AZs. To deploy the solution resources in isolated subnets, you must create VPC endpoints for the following AWS services:
    - s3, logs, ecr.api, ecr.dkr, ecs, ecs-agent, ecs-telemetry.
    - kinesis-streams, if you use KDS as a sink buffer in the ingestion module.
    - emr-serverless, glue, if you enable the data processing module.
    - redshift-data, sts, dynamodb (must be Gateway endpoints for DynamoDB), states, and lambda, if you enable Redshift as an analytics engine in the data modeling module.
- Data collection SDK: specify the SDK type that the client uses.
  - If you choose Clickstream SDK, refer to the SDK manual for the available Clickstream SDKs and integration guides.
  - If you choose Third-Party SDK, you need to follow this step to add a custom transformer plug-in that maps the data to the solution schema (if data modeling and reporting are needed). The solution has built-in support for Google Tag Manager server-side tagging; follow the Guidance for Using Google Tag Manager for Server-Side Website Analytics on AWS to set up the GTM server-side servers on AWS.
- Data location: specify the S3 bucket where the clickstream data is stored.
  Note: Buckets encrypted with AWS KMS keys (SSE-KMS) are not supported.
- Tags: specify additional tags for the AWS resources created by the solution.
  Note: The three built-in tags managed by the solution cannot be changed or removed.
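
The VPC endpoint requirements for isolated subnets depend on which modules you enable, so it can help to compute the list before creating the endpoints. The following is a minimal sketch, not part of the solution itself: the function name `required_vpc_endpoints` and its parameters are hypothetical, and it assumes the common `com.amazonaws.<region>.<service>` naming for endpoint services, which may differ in some AWS partitions.

```python
from typing import List

# Endpoint short names, copied from the VPC criteria above.
BASE_ENDPOINTS = ["s3", "logs", "ecr.api", "ecr.dkr", "ecs", "ecs-agent", "ecs-telemetry"]
KDS_ENDPOINTS = ["kinesis-streams"]
DATA_PROCESSING_ENDPOINTS = ["emr-serverless", "glue"]
# Per the criteria above, the dynamodb endpoint must be a Gateway endpoint.
REDSHIFT_ENDPOINTS = ["redshift-data", "sts", "dynamodb", "states", "lambda"]


def required_vpc_endpoints(
    region: str,
    use_kds_sink: bool = False,
    enable_data_processing: bool = False,
    enable_redshift: bool = False,
) -> List[str]:
    """Return the endpoint service names needed for isolated subnets.

    Hypothetical helper: the flags mirror the module choices described
    in the VPC criteria; they are not options of the solution itself.
    """
    names = list(BASE_ENDPOINTS)
    if use_kds_sink:
        names += KDS_ENDPOINTS
    if enable_data_processing:
        names += DATA_PROCESSING_ENDPOINTS
    if enable_redshift:
        names += REDSHIFT_ENDPOINTS
    # Assumption: standard service naming; verify it for your partition.
    return [f"com.amazonaws.{region}.{name}" for name in names]
```

For example, `required_vpc_endpoints("us-east-1", use_kds_sink=True)` returns the seven base endpoints plus the Kinesis Data Streams endpoint. You can compare the result against the endpoints that already exist in your VPC.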
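
Because buckets encrypted with SSE-KMS are not supported as the data location, you may want to check a bucket's default encryption before selecting it. The sketch below is an illustration under stated assumptions: `uses_sse_kms` is a hypothetical helper, and it expects a dictionary shaped like the response of the S3 `GetBucketEncryption` API (for example, what boto3's `get_bucket_encryption` returns). It only inspects the response; fetching it is left to you.

```python
def uses_sse_kms(encryption_response: dict) -> bool:
    """Return True if any default-encryption rule uses SSE-KMS.

    Hypothetical helper. `encryption_response` is assumed to follow the
    S3 GetBucketEncryption response shape.
    """
    config = encryption_response.get("ServerSideEncryptionConfiguration", {})
    for rule in config.get("Rules", []):
        default = rule.get("ApplyServerSideEncryptionByDefault", {})
        # "aws:kms" is the SSEAlgorithm value for SSE-KMS;
        # "AES256" denotes SSE-S3.
        if default.get("SSEAlgorithm") == "aws:kms":
            return True
    return False
```

If the function returns `True`, choose a different bucket or change the bucket's default encryption before using it as the data location.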