Orbit teams can interact with AWS Redshift clusters through the Orbit Redshift plugin. Refer to plugin-library for more details about adding a plugin to a team deployment.
To interact with Redshift clusters, the Orbit SDK provides helper functions that abstract the underlying AWS Boto3 calls and are integrated with Orbit JupyterLab IPython magic commands.
The following describes the ways in which the Orbit Workbench lets team users interact with AWS Redshift clusters.
Connect to an existing cluster, or create a new cluster if it does not exist.
>>> %connect_to_redshift -cluster db-test -reuse -start -func Standard Nodes=3
'-cluster' : str Specify cluster name
'-start' : bool, optional Start cluster if it does not exist (default False)
'-reuse' : bool, optional Reuse cluster if cluster name exists (default True)
'-func' : str, optional String-based options, parsed as follows (default None).
First parameter - Standard or a custom lambda name postfix
Standard - Default Orbit lambda used to create and connect to the cluster.
custom lambda name postfix - Used to create a custom Redshift cluster.
Remaining parameters - Redshift-specific arguments (e.g. Nodes, max_concurrency_scaling_clusters, etc.).
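To make the shape of the '-func' value concrete, here is a minimal sketch of how a string such as "Standard Nodes=3" could be split into a function name and key=value arguments. The helper parse_func_option is hypothetical and written only for illustration; it is not the parser the Orbit SDK uses, and the real magic may treat values differently.

```python
from typing import Dict, Tuple


def parse_func_option(value: str) -> Tuple[str, Dict[str, str]]:
    """Split a '-func' string into (function name, key=value arguments).

    Hypothetical helper for illustration; not the Orbit SDK's own parser.
    """
    tokens = value.split()
    if not tokens:
        raise ValueError("expected a function name such as 'Standard'")
    # The first token is 'Standard' or a custom lambda name postfix.
    func_name, arg_tokens = tokens[0], tokens[1:]
    args: Dict[str, str] = {}
    for token in arg_tokens:
        # Each remaining token is expected to look like key=value.
        key, sep, val = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        args[key] = val  # values are kept as strings in this sketch
    return func_name, args


func_name, cluster_args = parse_func_option("Standard Nodes=3")
print(func_name, cluster_args)
```

Values stay as strings here; converting "3" to an integer would be a separate step that depends on which Redshift argument is being set.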
The full cluster identifier is derived from the environment, team space, and cluster name:
cluster_identifier = f"orbit-{env}-{team_space}-{cluster_name}".lower()
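The identifier format above can be tried out in isolation. The sketch below wraps the same f-string in a small function; the function name and the sample environment and team space values ("dev", "lake-user") are assumptions made for illustration.

```python
def build_cluster_identifier(env: str, team_space: str, cluster_name: str) -> str:
    """Compose the identifier using the documented format, lower-cased."""
    return f"orbit-{env}-{team_space}-{cluster_name}".lower()


# Sample values are hypothetical; mixed case is normalized by .lower().
print(build_cluster_identifier("dev", "lake-user", "db-test"))
print(build_cluster_identifier("Dev", "Lake-User", "DB-Test"))
```

Both calls print the same identifier, which shows that the short name passed with '-cluster' is not case sensitive once it has been expanded.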
Delete an existing Redshift cluster.
>>> %delete_redshift_cluster -cluster db-test
'-cluster' : str Specify cluster name
Connect to a Redshift cluster and return the connection information once the cluster is available for use.
>>> from aws.utils.notebooks.database import RedshiftUtils
>>> from aws.utils.notebooks.common import get_properties
>>> response = RedshiftUtils.connect_to_redshift(cluster_name='cluster-test')
>>> print(response)
{
"db_url": <redshift+psycopg2://..>,
"engine": <engine>,
"cluster_identifier": <cluster name>,
"started": <started>,
"redshift_role": <'arn:aws:iam::{accountid}:role/...'>,
}
cluster_name : str Name of the Redshift cluster
reuseCluster : bool, optional Whether to reuse an existing cluster with the same name (default True).
startCluster : bool, optional Whether to start a new cluster (default False).
clusterArgs : dict, optional Other Redshift parameters you can optionally specify (e.g. auto_analyze, max_concurrency_scaling_clusters, statement_timeout, etc.)
db_url : str A SQLAlchemy connection string
engine : sqlalchemy.engine.Engine A SQLAlchemy engine
cluster_identifier : str The unique identifier of the cluster
started : bool Whether the cluster has started
redshift_role : str The Redshift role ARN that can be used to access other AWS services when executing Amazon Redshift commands
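As a rough illustration of working with the returned fields, the sketch below extracts a credential-free summary from a response-shaped dictionary, which can be handy for logging. The helper summarize_connection and every value in the sample dictionary are hypothetical. It also assumes that db_url follows the usual scheme://user:password@host:port/database layout, which the source does not spell out.

```python
from typing import Any, Dict
from urllib.parse import urlsplit


def summarize_connection(response: Dict[str, Any]) -> Dict[str, Any]:
    """Return the non-secret parts of a connect_to_redshift-style response.

    Hypothetical helper; assumes db_url is a standard URL with host and port.
    """
    parts = urlsplit(response["db_url"])
    return {
        "cluster_identifier": response["cluster_identifier"],
        "started": response["started"],
        "host": parts.hostname,  # user name and password are left out
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


# Stand-in for a real response; all values here are made up.
sample_response = {
    "db_url": "redshift+psycopg2://user:secret@example-host:5439/defaultdb",
    "engine": None,  # a sqlalchemy.engine.Engine in a real response
    "cluster_identifier": "orbit-dev-lake-user-cluster-test",
    "started": True,
    "redshift_role": "arn:aws:iam::123456789012:role/example-role",
}
print(summarize_connection(sample_response))
```

In a notebook, the real response from RedshiftUtils.connect_to_redshift would take the place of sample_response.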
The JupyterLab UI has a Compute section that lists the Redshift clusters and provides buttons to create and delete them. The screenshot below shows the modal dialog that asks for the cluster name, number of nodes, and node type.
Click the “+” icon to create a new cluster. To delete a cluster, select it in the list and click the ‘x’ icon.