AWS Well-Architected pillars
This solution was designed with best practices from the AWS Well-Architected Framework which helps customers design and operate reliable, secure, efficient, and cost-effective workloads in the cloud.
This section describes how the design principles and best practices of the Well-Architected Framework were applied when building this solution.
Operational excellence
This section describes how the principles and best practices of the operational excellence pillar were applied when designing this solution.
The Clickstream Analytics on AWS solution pushes metrics, logs and traces to Amazon CloudWatch at various stages to provide observability into the infrastructure, Elastic load balancer, Amazon ECS cluster, Lambda functions, EMR serverless application, Step Function workflow and the rest of the solution components. This solution also creates the CloudWatch dashboard for each data pipeline.
Security
This section describes how the principles and best practices of the security pillar were applied when designing this solution.
- Clickstream Analytics on AWS web console users are authenticated and authorized with Amazon Cognito or OpenID Connect.
- All inter-service communications use AWS IAM roles.
- All roles used by the solution follows least-privilege access. That is, it only contains minimum permissions required so the service can function properly.
Reliability
This section describes how the principles and best practices of the reliability pillar were applied when designing this solution.
- Using AWS serverless services wherever possible (for example, EMR Serverless, Redshift Serverless, Lambda, Step Functions, Amazon S3, and Amazon SQS) to ensure high availability and recovery from service failure.
- Data ingested by data pipeline is stored in Amazon S3 and Amazon Redshift, so it persists in multiple Availability Zones (AZs) by default.
Performance efficiency
This section describes how the principles and best practices of the performance efficiency pillar were applied when designing this solution.
- The ability to launch this solution in any Region that supports AWS services in this solution such as: Amazon S3, Amazon ECS, Elastic load balancer.
- Using serverless architecture removes the need for you to run and maintain physical servers for traditional compute activities.
- Automatically testing and deploying this solution daily. Reviewing this solution by solution architects and subject matter experts for areas to experiment and improve.
Cost optimization
This section describes how the principles and best practices of the cost optimization pillar were applied when designing this solution.
- Use Autoscaling Group so that the compute costs are only related to how much data is ingested and processed.
- Using serverless services such as Amazon S3, Amazon Kinesis Data Streams, Amazon EMR Serverless and Amazon Redshift Serverless so that customers only get charged for what they use.
Sustainability
This section describes how the principles and best practices of the sustainability pillar were applied when designing this solution.
- The solution's serverless design (using Amazon Kinesis Data Streams, Amazon EMR Serverless, Amazon Redshift Serverless and Amazon QuickSight) and the use of managed services (such as Amazon ECS, Amazon MSK) are aimed at reducing carbon footprint compared to the footprint of continually operating on-premises servers.