aws_ddk_core.resources.DataBrewFactory

class aws_ddk_core.resources.DataBrewFactory

Class factory to create and configure DataBrew DDK resources, including Jobs.

__init__()

Methods

`__init__`()
`job`(scope, id, environment_id, name, ...[, ...])	Create and configure a DataBrew job.

static job(scope: constructs.Construct, id: str, environment_id: str, name: str, role_arn: str, type: str, dataset_name: Optional[str] = None, recipe: Optional[aws_cdk.aws_databrew.CfnJob.RecipeProperty] = None, encryption_mode: Optional[str] = None, log_subscription: Optional[str] = None, max_capacity: Optional[int] = None, max_retries: Optional[int] = None, output_location: Optional[aws_cdk.aws_databrew.CfnJob.OutputLocationProperty] = None, outputs: Optional[Sequence[aws_cdk.aws_databrew.CfnJob.OutputProperty]] = None, timeout: Optional[aws_cdk.Duration] = None, **job_props: Any) → aws_cdk.aws_databrew.CfnJob

Create and configure a DataBrew job.

This construct lets you configure job parameters through the ddk.json configuration file, depending on the environment_id in which the job is used. Supported parameters are: max_capacity, max_retries, timeout.

The parameters are applied in the following order of precedence:
1 - Explicit arguments are always preferred
2 - Values from the configuration file
3 - Defaults are used otherwise
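The precedence order above can be illustrated with a minimal, library-independent sketch. This is not the DDK implementation: `resolve_job_params`, `SUPPORTED_PARAMS`, and the sample values are hypothetical, and the sketch assumes that an explicit argument left as `None` counts as "not provided" (consistent with the `Optional[...] = None` defaults in the signature).

```python
from typing import Any, Dict, Optional

# Parameters that the documentation lists as configurable via ddk.json.
SUPPORTED_PARAMS = ("max_capacity", "max_retries", "timeout")


def resolve_job_params(
    explicit: Dict[str, Optional[Any]],
    config: Dict[str, Any],
    defaults: Dict[str, Any],
) -> Dict[str, Any]:
    """Hypothetical helper: pick each value as explicit > config > default."""
    resolved: Dict[str, Any] = {}
    for name in SUPPORTED_PARAMS:
        if explicit.get(name) is not None:
            # 1 - an explicit argument always wins
            resolved[name] = explicit[name]
        elif name in config:
            # 2 - otherwise use the value from the configuration file
            resolved[name] = config[name]
        else:
            # 3 - otherwise fall back to the default (None if there is none)
            resolved[name] = defaults.get(name)
    return resolved


# Illustrative values only; 3600 mirrors the documented timeout default in seconds.
params = resolve_job_params(
    explicit={"max_retries": 5, "max_capacity": None},
    config={"max_retries": 2, "max_capacity": 10},
    defaults={"max_capacity": 5, "max_retries": 0, "timeout": 3600},
)
print(params)
```

Here max_retries comes from the explicit argument, max_capacity from the configuration values, and timeout from the defaults.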

Parameters

scope (Construct) – Scope within which this construct is defined
id (str) – Identifier of the DataBrew job
environment_id (str) – Identifier of the environment in which the job is used
name (str) – Name of the DataBrew job
role_arn (str) – ARN of the execution role of the DataBrew job
type (str) –

The type of the DataBrew job, which must be one of the following:
PROFILE - A job to analyze a dataset, to determine its size, data types, data distribution, and more.
RECIPE - A job to apply one or more transformations to a dataset.
dataset_name (Optional[str]) – The name of the DataBrew dataset to be processed by the DataBrew job
recipe (Optional[databrew.CfnJob.RecipeProperty]) – The recipe to be used by the DataBrew job which is a series of data transformation steps.
encryption_mode (Optional[str]) –

The encryption mode to be used by the DataBrew job, which can be one of the following:
SSE-KMS - Server-side encryption with keys managed by AWS KMS.
SSE-S3 - Server-side encryption with keys managed by Amazon S3.
log_subscription (Optional[str]) – The status of Amazon CloudWatch logging for the DataBrew job
max_capacity (Optional[int]) – The maximum number of nodes that can be consumed by the DataBrew job.
max_retries (Optional[int]) – The maximum number of times to retry the DataBrew job
output_location (Optional[databrew.CfnJob.OutputLocationProperty]) – Output location to be used by the DataBrew job
outputs (Optional[Sequence[databrew.CfnJob.OutputProperty]]) – One or more output artifacts that represent the output of the DataBrew job
timeout (Optional[cdk.Duration]) – The job execution time (in seconds) after which DataBrew terminates the job. aws_cdk.Duration.seconds(3600) by default.
job_props (Any) – Additional job properties. For a complete list of properties, refer to the CDK documentation for the DataBrew job: https://docs.aws.amazon.com/cdk/api/v2/python/aws_cdk.aws_databrew/CfnJob.html

Returns

job – DataBrew job

Return type

databrew.CfnJob
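Example

A call to the factory might look like the sketch below, which follows the signature documented above. Every concrete value is a placeholder assumption: the `dev` environment id, the job and dataset names, the role ARN, and the bucket are all hypothetical, and `build_profile_job` is an illustrative wrapper, not part of the library. The `scope` argument is expected to be an existing construct such as a stack.

```python
def build_profile_job(scope, environment_id="dev"):
    """Create a PROFILE job via DataBrewFactory.job (all values are placeholders)."""
    # Imports are kept local so this snippet can be loaded without the CDK installed.
    import aws_cdk.aws_databrew as databrew
    from aws_cdk import Duration
    from aws_ddk_core.resources import DataBrewFactory

    return DataBrewFactory.job(
        scope,
        id="sales-profile-job",                # hypothetical construct id
        environment_id=environment_id,         # selects the ddk.json environment
        name="sales-profile-job",              # hypothetical job name
        role_arn="arn:aws:iam::111111111111:role/example-databrew-role",
        type="PROFILE",
        dataset_name="sales-dataset",          # hypothetical DataBrew dataset
        output_location=databrew.CfnJob.OutputLocationProperty(
            bucket="example-output-bucket",    # hypothetical S3 bucket
            key="profiles/",
        ),
        max_retries=1,                         # explicit, so it overrides ddk.json
        timeout=Duration.minutes(30),          # explicit, so it overrides the default
    )
```

Because max_retries and timeout are passed explicitly, they take precedence over any values in ddk.json; max_capacity is left unset, so it falls back to the configuration file and then to the default.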