RedshiftServerlessWorkgroup

A Redshift Serverless Workgroup with helpers method for Redshift administration.

Overview

RedshiftServerlessWorkgroup is a Redshift Serverless Workgroup with the following options/capabilities:

Deployed in a VPC in private subnets. The network configuation can be customized.
Provide helper methods for running SQL commands via the Redshift Data API. Commands can be custom or predefined for common administration tasks like creating and granting roles.
Initialize a Glue Data Catalog integration with auto crawling via Glue Crawlers. This would allow tables in Redshift Serverless to appear in the Glue Data Catalog for the purposes of discovery and integration.

warning

The default VPC created by the construct follows the standard implementation of the VPC L2 CDK Construct. As a result, if no account ID and no region are configured in the CDK Stack, the VPC will only contain 2 AZ and the Redshift Serverless Workgroup will fail to deploy. Be sure to configure the account ID and region to create a 3 AZ VPC.

Usage

TypeScript
Python

class ExampleDefaultRedshiftServerlessWorkgroupStack extends Stack {
constructor(scope: Construct, id: string) {
  super(scope, id)

  const namespace = new dsf.consumption.RedshiftServerlessNamespace(this, 'DefaultRedshiftServerlessNamespace', {
    name: "default",
    dbName: 'defaultdb',
  })

  new dsf.consumption.RedshiftServerlessWorkgroup(this, "DefaultRedshiftServerlessWorkgroup", {
    name: "default",
    namespace: namespace,
  })
}
}

class ExampleDefaultRedshiftServerlessWorkgroupStack(Stack):
  def __init__(self, scope, id):
      super().__init__(scope, id)

      namespace = dsf.consumption.RedshiftServerlessNamespace(self, "DefaultRedshiftServerlessNamespace",
          name="default",
          db_name="defaultdb"
      )

      dsf.consumption.RedshiftServerlessWorkgroup(self, "DefaultRedshiftServerlessWorkgroup",
          name="default",
          namespace=namespace
      )

Bootstrapping Redshift Serverless w/ RedshiftData Construct

The RedshiftData construct allows custom SQLs to run against the RedshiftServerlessWorkgroup via the Data API. This allows users to bootstrap Redshift directly from CDK.

The RedshiftData construct provides the following helpers for bootstrapping Redshift databases:

Run a custom SQL command
Create Redshift roles
Grant Redshift roles full access to schemas
Grant Redshift roles read only access
Run a COPY command to load data

TypeScript
Python

  const workgroup = new dsf.consumption.RedshiftServerlessWorkgroup(this, "DefaultRedshiftServerlessWorkgroup", {
    name: "default",
    namespace: namespace,
  })

  // Run a custom SQL to create a customer table
  const createTable = workgroup.runCustomSQL('CreateCustomerTable', "defaultdb",
    `
    CREATE TABLE customer(
      customer_id varchar(50),
      salutation varchar(5),
      first_name varchar(50),
      last_name varchar(50),
      email_address varchar(100)
    )
    diststyle even
    `,
    "drop table customer"
  );

  // Run a COPY command to load data into the customer table
  const ingestion = workgroup.ingestData('ExampleCopy', "defaultdb", "customer", bucket, "data-products/customer/", "csv ignoreheader 1");

  // Add dependencies between Redshift Data API commands because CDK cannot infer them
  ingestion.node.addDependency(createTable);

  // Create an engineering role in the defaultdb
  const dbRole = workgroup.createDbRole('EngineeringRole', 'defaultdb', 'engineering');

  // Grant the engineering role full access to the public schema in the defaultdb
  const dbSchema = workgroup.grantDbSchemaToRole('EngineeringGrant', 'defaultdb', 'public', 'engineering');

  // Enforce dependencies
  dbSchema.node.addDependency(dbRole);

workgroup = dsf.consumption.RedshiftServerlessWorkgroup(self, "DefaultRedshiftServerlessWorkgroup",
  name="default",
  namespace=namespace
)

# Run a custom SQL to create a customer table
create_table = workgroup.run_custom_sQL("CreateCustomerTable", "defaultdb", """
        CREATE TABLE customer(
          customer_id varchar(50),
          salutation varchar(5),
          first_name varchar(50),
          last_name varchar(50),
          email_address varchar(100)
        )
        diststyle even
        """, "drop table customer")

# Run a COPY command to load data into the customer table
ingestion = workgroup.ingest_data("ExampleCopy", "defaultdb", "customer", bucket, "data-products/customer/", "csv ignoreheader 1")

# Add dependencies between Redshift Data API commands because CDK cannot infer them
ingestion.node.add_dependency(create_table)

# Create an engineering role in the defaultdb
db_role = workgroup.create_db_role("EngineeringRole", "defaultdb", "engineering")

# Grant the engineering role full access to the public schema in the defaultdb
db_schema = workgroup.grant_db_schema_to_role("EngineeringGrant", "defaultdb", "public", "engineering")

# Enforce dependencies
db_schema.node.add_dependency(db_role)

Cataloging Redshift Serverless Tables

Redshift tables and databases can also be automatically catalog in Glue Data Catalog using an helper method. This method creates a Glue Catalog database as well as a crawler to populate the database with table metadata from your Redshift database.

The default value of the path that the crawler would use is <databaseName>/public/% which translates to all the tables in the public schema. Please refer to the crawler documentation for more information for JDBC data sources.

TypeScript
Python

class ExampleRedshiftServerlessWorkgroupCatalogStack extends Stack {
constructor(scope: Construct, id: string) {
  super(scope, id)

  const namespace = new dsf.consumption.RedshiftServerlessNamespace(this, 'DefaultRedshiftServerlessNamespace', {
    name: "default",
    dbName: 'defaultdb',
  })

  const workgroup = new dsf.consumption.RedshiftServerlessWorkgroup(this, "DefaultRedshiftServerlessWorkgroup", {
    name: "default",
    namespace: namespace,
  })

  workgroup.catalogTables('RedshiftCatalog', "example-redshift-db", 'defaultdb/public/%')
}
}

class ExampleRedshiftServerlessWorkgroupCatalogStack(Stack):
  def __init__(self, scope, id):
      super().__init__(scope, id)

      namespace = dsf.consumption.RedshiftServerlessNamespace(self, "DefaultRedshiftServerlessNamespace",
          name="default",
          db_name="defaultdb"
      )

      workgroup = dsf.consumption.RedshiftServerlessWorkgroup(self, "DefaultRedshiftServerlessWorkgroup",
          name="default",
          namespace=namespace
      )

      workgroup.catalog_tables("RedshiftCatalog", "example-redshift-db", "defaultdb/public/%")

Overview​

Usage​

Bootstrapping Redshift Serverless w/ RedshiftData Construct​

Cataloging Redshift Serverless Tables​

Overview

Usage

Bootstrapping Redshift Serverless w/ RedshiftData Construct

Cataloging Redshift Serverless Tables