DataZoneMskAuthorizer
Custom DataZone MSK authorizer for granting access to MSK topics via DataZone asset subscription workflow.
Overview
The DataZone MSK Authorizer is a custom process integrated with DataZone that implements the Subscription Grant concept for Kafka topics hosted on Amazon MSK (Provisioned and Serverless), secured by IAM policies, and registered in DataZone using the DataZoneMskAssetType.
It supports:
- cross-account access with MSK Provisioned clusters
- MSK managed VPC connectivity permissions with MSK Provisioned clusters
- Glue Schema Registry permissions when sharing in the same account
The authorizer is composed of 2 constructs:
- the DataZoneMskCentralAuthorizer is responsible for collecting metadata on the Subscription Grant, orchestrating the workflow and acknowledging the Subscription Grant creation. This construct must be deployed in the AWS root account of the DataZone Domain.
- the DataZoneMskEnvironmentAuthorizer is responsible for managing the permissions on the producer and consumer side. This construct must be deployed once per account associated with the DataZone Domain.
Cross-account synchronization is done exclusively through an EventBridge bus, which keeps cross-account permissions to a minimum.
DataZoneMskCentralAuthorizer
The DataZoneMskCentralAuthorizer is the central component that receives all the Subscription Grant Requests from DataZone for the MskTopicAssetType and orchestrates the end-to-end workflow.
The workflow is a Step Functions State Machine that is triggered by events emitted by DataZone and contains the following steps:
- Metadata collection: a Lambda Function collects additional information from DataZone on the producer and the subscriber, and updates the status of the Subscription Grant to IN_PROGRESS.
- Producer grant trigger: an event is sent to the producer account to request the creation of the grant on the producer MSK cluster (implemented in the DataZoneMskEnvironmentAuthorizer). This step is an asynchronous state using a callback mechanism from the DataZoneMskEnvironmentAuthorizer.
- Consumer grant trigger: an event is sent to the consumer account to request the creation of the grant on the IAM consumer Role (implemented in the DataZoneMskEnvironmentAuthorizer). This step is an asynchronous state using a callback mechanism from the DataZoneMskEnvironmentAuthorizer.
- DataZone Subscription Grant callback: a Lambda Function updates the status of the Subscription Grant in DataZone to GRANTED or REVOKED based on the initial request.
If any failure happens during the process, the Step Functions state machine catches the exception and updates the status of the Subscription Grant to GRANT_FAILED or REVOKE_FAILED.
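The failure handling described above can be pictured with a small simulation. This is only a sketch in plain Python, not the Step Functions definition used by the construct: the function name `run_grant_workflow`, the step callables and the `"GRANT"` / `"REVOKE"` request values are hypothetical, and the status strings are illustrative.

```python
from typing import Callable, List

# Hypothetical step type: a step does its work or raises an exception.
Step = Callable[[], None]


def run_grant_workflow(request_type: str, steps: List[Step]) -> str:
    """Run the steps in order and return the final Subscription Grant status.

    `request_type` is "GRANT" or "REVOKE" (illustrative values, not the
    DataZone event schema).
    """
    if request_type not in ("GRANT", "REVOKE"):
        raise ValueError(f"unsupported request type: {request_type}")
    try:
        for step in steps:
            step()
    except Exception:
        # Any failure is caught and mapped to the matching *_FAILED status.
        # Steps that already ran are not rolled back.
        return f"{request_type}_FAILED"
    return "GRANTED" if request_type == "GRANT" else "REVOKED"


def succeed() -> None:
    """Stand-in for a step that completes normally."""


def fail() -> None:
    """Stand-in for a step that raises, e.g. a failed consumer grant."""
    raise RuntimeError("consumer grant failed")


print(run_grant_workflow("GRANT", [succeed, succeed]))  # GRANTED
print(run_grant_workflow("REVOKE", [succeed, fail]))    # REVOKE_FAILED
```

Note that in the second call the first step has already run when the failure occurs, which mirrors the situation where a producer grant exists but the consumer grant failed.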
If the grant fails for the consumer, the grant already applied for the producer is not reverted, but the failure is propagated so the user is notified within DataZone. The authorizer process is idempotent, so it is safe to replay the workflow: all permissions are deduplicated. If the workflow is not replayed, the producer grant must be cleaned up manually.
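To illustrate what deduplicated permissions can mean in practice, here is a minimal sketch of an idempotent "add statement" operation on a list of IAM-style policy statements. It is an assumption about the general technique, not the authorizer's actual code; `add_statement` and the example ARN are hypothetical.

```python
import json
from typing import Any, Dict, List

Statement = Dict[str, Any]


def _canonical(statement: Statement) -> str:
    # Serialize with sorted keys so dict key order does not affect equality.
    # The order of items inside lists (e.g. "Action") still matters here.
    return json.dumps(statement, sort_keys=True)


def add_statement(statements: List[Statement], new: Statement) -> List[Statement]:
    """Return a new list with `new` appended only if it is not already present."""
    seen = {_canonical(s) for s in statements}
    if _canonical(new) in seen:
        return list(statements)
    return [*statements, new]


read_orders = {
    "Effect": "Allow",
    "Action": ["kafka-cluster:ReadData"],
    # Hypothetical topic ARN used only for the example.
    "Resource": "arn:aws:kafka:eu-west-1:111122223333:topic/demo/abcd1234/orders",
}

once = add_statement([], read_orders)
twice = add_statement(once, read_orders)  # replaying the grant changes nothing
print(len(once), len(twice))  # 1 1
```

Because applying the same grant twice yields the same policy, replaying a partially failed workflow does not accumulate duplicate statements.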
Usage
TypeScript:

    new dsf.governance.DataZoneMskCentralAuthorizer(this, 'MskAuthorizer', {
      domainId: 'aba_dc999t9ime9sss',
    });

Python:

    dsf.governance.DataZoneMskCentralAuthorizer(self, "MskAuthorizer",
        domain_id="aba_dc999t9ime9sss"
    )
Register producer and consumer accounts
The DataZoneMskCentralAuthorizer construct works in collaboration with the DataZoneMskEnvironmentAuthorizer construct, which is deployed into the producer and consumer accounts.
To enable the integration, register accounts using the registerAccount() method on the DataZoneMskCentralAuthorizer object.
This grants the permissions required for the central account and the environment accounts to communicate via EventBridge events.
TypeScript:

    const centralAuthorizer = new dsf.governance.DataZoneMskCentralAuthorizer(this, 'MskAuthorizer', {
      domainId: 'aba_dc999t9ime9sss',
    });

    // Add an account that is associated with the DataZone Domain
    centralAuthorizer.registerAccount('AccountRegistration', '123456789012');

Python:

    central_authorizer = dsf.governance.DataZoneMskCentralAuthorizer(self, "MskAuthorizer",
        domain_id="aba_dc999t9ime9sss"
    )

    # Add an account that is associated with the DataZone Domain
    central_authorizer.register_account("AccountRegistration", "123456789012")
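The policies that account registration creates are not shown in this document. As a rough illustration of the kind of permission involved, the sketch below builds a generic EventBridge event bus resource policy statement that lets another account put events on a bus. The helper name `put_events_statement`, the `Sid` value and the bus ARN are hypothetical; the construct may well set up its permissions differently.

```python
from typing import Any, Dict


def put_events_statement(sid: str, source_account_id: str, bus_arn: str) -> Dict[str, Any]:
    """Build a resource policy statement allowing an account to send events to a bus."""
    if not (source_account_id.isdigit() and len(source_account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    return {
        "Sid": sid,
        "Effect": "Allow",
        # Trust the whole account; a real setup may narrow this to a role.
        "Principal": {"AWS": f"arn:aws:iam::{source_account_id}:root"},
        "Action": "events:PutEvents",
        "Resource": bus_arn,
    }


statement = put_events_statement(
    sid="AllowEnvironmentAccount",
    source_account_id="123456789012",
    # Hypothetical central bus ARN, for illustration only.
    bus_arn="arn:aws:events:eu-west-1:111122223333:event-bus/central-bus",
)
print(statement["Principal"]["AWS"])  # arn:aws:iam::123456789012:root
```

Limiting the statement to `events:PutEvents` on a single bus is what keeps the cross-account surface small: no other API in the target account is opened.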
DataZoneMskEnvironmentAuthorizer
The DataZoneMskEnvironmentAuthorizer is responsible for managing the permissions required to grant access on MSK Topics (and associated Glue Schema Registry) via IAM policies.
The workflow is a Step Functions State Machine that is triggered by events emitted by the DataZoneMskCentralAuthorizer and contains the following steps:
- Grant the producer or consumer based on the request. If the event is a cross-account producer grant, a Lambda function adds an IAM policy statement to the MSK Cluster policy granting read access to the IAM consumer Role. Optionally, it can also grant the use of MSK managed VPC connectivity.
- Call back the DataZoneMskCentralAuthorizer: an EventBridge event is sent on the central EventBridge Bus to continue the workflow in the central account using the callback mechanism of Step Functions.
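As an illustration of the producer grant step, the sketch below builds cluster policy statements that give a consumer role read access to a single topic. It is an assumption about the shape of such statements, not the statements the construct generates: the helper `topic_read_statements` is hypothetical, the ARNs are examples, and consumer group permissions are left out for brevity.

```python
from typing import Any, Dict, List


def topic_read_statements(cluster_arn: str, topic: str, consumer_role_arn: str) -> List[Dict[str, Any]]:
    """Build resource policy statements granting read access on one topic."""
    prefix, sep, cluster_path = cluster_arn.partition(":cluster/")
    if not sep:
        raise ValueError(f"not an MSK cluster ARN: {cluster_arn}")
    # Topic ARNs reuse the cluster name and UUID, followed by the topic name.
    topic_arn = f"{prefix}:topic/{cluster_path}/{topic}"
    principal = {"AWS": consumer_role_arn}
    return [
        {
            "Effect": "Allow",
            "Principal": principal,
            "Action": ["kafka-cluster:Connect"],
            "Resource": cluster_arn,
        },
        {
            "Effect": "Allow",
            "Principal": principal,
            "Action": ["kafka-cluster:DescribeTopic", "kafka-cluster:ReadData"],
            "Resource": topic_arn,
        },
    ]


statements = topic_read_statements(
    cluster_arn="arn:aws:kafka:eu-west-1:111122223333:cluster/demo/abcd1234-0123-abcd-5678-1234abcd-1",
    topic="orders",
    consumer_role_arn="arn:aws:iam::123456789012:role/ConsumerRole",
)
print(statements[1]["Resource"])
```

Scoping the second statement to one topic ARN, rather than the whole cluster, matches the idea of granting access per subscribed asset.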
Usage
TypeScript:

    new dsf.governance.DataZoneMskEnvironmentAuthorizer(this, 'MskAuthorizer', {
      domainId: 'aba_dc999t9ime9sss',
    });

Python:

    dsf.governance.DataZoneMskEnvironmentAuthorizer(self, "MskAuthorizer",
        domain_id="aba_dc999t9ime9sss"
    )
Restricting IAM permissions on consumer roles with IAM permissions boundary
The construct is based on a Lambda Function that attaches policies to IAM Roles using the IAM API PutRolePolicy.
Permissions applied to the consumer Roles can be restricted using IAM Permissions Boundaries. The DataZoneMskEnvironmentAuthorizer construct provides a static member containing the IAM Statement to include in the IAM permissions boundary of the consumer role.
TypeScript:

    import { Effect, ManagedPolicy, PolicyStatement, Role, ServicePrincipal } from 'aws-cdk-lib/aws-iam';

    const permissionBoundaryPolicy = new ManagedPolicy(this, 'PermissionBoundaryPolicy', {
      statements: [
        // example of other permissions needed by the consumer
        new PolicyStatement({
          effect: Effect.ALLOW,
          actions: ['s3:*'],
          resources: ['*'],
        }),
        // permissions needed to consume MSK topics and granted by the Authorizer
        dsf.governance.DataZoneMskEnvironmentAuthorizer.PERMISSIONS_BOUNDARY_STATEMENTS,
      ],
    });

    // consumer role restricted by the permissions boundary
    new Role(this, 'ConsumerRole', {
      assumedBy: new ServicePrincipal('lambda.amazonaws.com'),
      permissionsBoundary: permissionBoundaryPolicy,
    });

Python:

    from aws_cdk.aws_iam import Effect, ManagedPolicy, PolicyStatement, Role, ServicePrincipal

    permission_boundary_policy = ManagedPolicy(self, "PermissionBoundaryPolicy",
        statements=[
            # example of other permissions needed by the consumer
            PolicyStatement(
                effect=Effect.ALLOW,
                actions=["s3:*"],
                resources=["*"]
            ),
            # permissions needed to consume MSK topics and granted by the Authorizer
            dsf.governance.DataZoneMskEnvironmentAuthorizer.PERMISSIONS_BOUNDARY_STATEMENTS
        ]
    )

    # consumer role restricted by the permissions boundary
    Role(self, "ConsumerRole",
        assumed_by=ServicePrincipal("lambda.amazonaws.com"),
        permissions_boundary=permission_boundary_policy
    )
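The reason the boundary must include the MSK statement is that a permissions boundary caps what identity policies can grant: an action is effective only when both the role's policies and the boundary allow it. The sketch below is a deliberately simplified model of that rule (it ignores resources, conditions and explicit denies). The function names and the `kafka-cluster:*` boundary entry are assumptions for illustration, not the content of PERMISSIONS_BOUNDARY_STATEMENTS.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def _allows(patterns: Iterable[str], action: str) -> bool:
    # IAM action names are case-insensitive and support "*" and "?" wildcards.
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def effective_allow(identity_actions: Iterable[str], boundary_actions: Iterable[str], action: str) -> bool:
    """An action is effective only if the identity policy AND the boundary allow it."""
    return _allows(identity_actions, action) and _allows(boundary_actions, action)


# Inline policy on the consumer role (hypothetical content).
identity = ["kafka-cluster:ReadData", "iam:CreateUser"]
# Permissions boundary attached to the consumer role (hypothetical content).
boundary = ["s3:*", "kafka-cluster:*"]

print(effective_allow(identity, boundary, "kafka-cluster:ReadData"))  # True
print(effective_allow(identity, boundary, "iam:CreateUser"))          # False
print(effective_allow(identity, boundary, "s3:GetObject"))            # False
```

The last call shows that a boundary never grants anything by itself: `s3:GetObject` is inside the boundary but no identity policy allows it, so it stays denied.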
Cross account workflow
If the DataZoneMskEnvironmentAuthorizer is deployed in a different account than the DataZone root account where the DataZoneMskCentralAuthorizer is deployed, you need to configure the central account ID to authorize cross-account communication:
TypeScript:

    new dsf.governance.DataZoneMskEnvironmentAuthorizer(this, 'MskAuthorizer', {
      domainId: 'aba_dc999t9ime9sss',
      centralAccountId: '123456789012',
    });

Python:

    dsf.governance.DataZoneMskEnvironmentAuthorizer(self, "MskAuthorizer",
        domain_id="aba_dc999t9ime9sss",
        central_account_id="123456789012"
    )
Granting MSK Managed VPC connectivity
For easier cross-account Kafka consumption, MSK Provisioned clusters can use the multi-VPC private connectivity feature which is a managed solution that simplifies the networking infrastructure for multi-VPC and cross-account connectivity.
By default, the multi-VPC private connectivity permissions are not configured. You can enable them using the construct properties:
TypeScript:

    new dsf.governance.DataZoneMskEnvironmentAuthorizer(this, 'MskAuthorizer', {
      domainId: 'aba_dc999t9ime9sss',
      centralAccountId: '123456789012',
      grantMskManagedVpc: true,
    });

Python:

    dsf.governance.DataZoneMskEnvironmentAuthorizer(self, "MskAuthorizer",
        domain_id="aba_dc999t9ime9sss",
        central_account_id="123456789012",
        grant_msk_managed_vpc=True
    )