Creates an AWS datalake.

--datalake-name <value>
--environment-name <value>
--cloud-provider-configuration <value>
[--scale <value>]
[--tags <value>]
[--runtime <value>]
[--image <value>]
[--enable-ranger-raz | --no-enable-ranger-raz]
[--multi-az | --no-multi-az]
[--recipes <value>]
[--custom-instance-groups <value>]
[--cli-input-json <value>]
[--generate-cli-skeleton]


--datalake-name (string)

The datalake name. This name must be unique, must have between 5 and 100 characters, and must contain only lowercase letters, numbers and hyphens. Names are case-sensitive.
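The length and character rules above can be checked locally before calling the CLI. The following is a minimal sketch, not part of the CLI itself; the function name `is_valid_datalake_name` is hypothetical, and uniqueness cannot be verified this way because it depends on the names that already exist in the account.

```python
import re

# Derived from the rule stated above: 5 to 100 characters drawn from
# lowercase ASCII letters, digits and hyphens.
DATALAKE_NAME_PATTERN = re.compile(r"[a-z0-9-]{5,100}")


def is_valid_datalake_name(name: str) -> bool:
    """Return True if `name` satisfies the documented length and character rules."""
    return DATALAKE_NAME_PATTERN.fullmatch(name) is not None
```

Because names are case-sensitive and only lowercase letters are allowed, a name such as `MyDatalake` is rejected rather than lowercased.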

--environment-name (string)

The environment name or CRN.

--cloud-provider-configuration (object)

Request object for AWS configuration.

instanceProfile -> (string)

The ARN of an IAM instance profile.

storageBucketLocation -> (string)

The location of the S3 bucket to be used as storage. The location has to start with s3a:// followed by the bucket name.

Shorthand Syntax:

instanceProfile=string,storageBucketLocation=string

JSON Syntax:

{
  "instanceProfile": "string",
  "storageBucketLocation": "string"
}
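As an illustration of the JSON form, the sketch below builds the value for --cloud-provider-configuration and enforces the documented s3a:// prefix. The helper name `build_cloud_provider_configuration` is hypothetical, and the check is only a local sanity test; it does not confirm that the bucket or the instance profile exists.

```python
import json

S3A_PREFIX = "s3a://"


def build_cloud_provider_configuration(instance_profile_arn: str,
                                       storage_bucket_location: str) -> str:
    """Return a JSON string suitable for --cloud-provider-configuration."""
    # Documented rule: the location starts with s3a:// followed by the bucket name.
    bucket_and_path = storage_bucket_location[len(S3A_PREFIX):]
    if not storage_bucket_location.startswith(S3A_PREFIX) or not bucket_and_path:
        raise ValueError("storageBucketLocation must look like s3a://<bucket-name>")
    return json.dumps({
        "instanceProfile": instance_profile_arn,
        "storageBucketLocation": storage_bucket_location,
    })
```

The returned string can be passed as a single argument value. Any ARN or bucket name used with it here is a placeholder, not a real resource.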

--scale (string)

Represents the available datalake scales. Defaults to LIGHT_DUTY if not set.

Possible values:



--tags (array)

Tags to be added to Data Lake related resources.

Shorthand Syntax:

key=string,value=string ... (separate items with spaces)

JSON Syntax:

[
  {
    "key": "string",
    "value": "string"
  }
  ...
]
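When tags are held in a dictionary, they can be turned into shorthand items along the following lines. This is a sketch under a conservative assumption: because the escaping rules of the shorthand syntax are not described here, the hypothetical helper `tags_to_shorthand` refuses keys or values that contain commas, equals signs or whitespace, and the JSON syntax should be used for those instead.

```python
from typing import Dict, List


def tags_to_shorthand(tags: Dict[str, str]) -> List[str]:
    """Convert {"key": "value"} pairs into key=...,value=... shorthand items."""
    items = []
    for key, value in tags.items():
        for part in (key, value):
            # Assumption: these characters would be ambiguous in shorthand form.
            if any(ch.isspace() or ch in ",=" for ch in part):
                raise ValueError(f"use the JSON syntax for tag part {part!r}")
        items.append(f"key={key},value={value}")
    return items
```

Each returned item is one tag. On a command line the items are separated by spaces, so they map naturally to separate argument list entries after --tags.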

--runtime (string)

Cloudera Runtime version.

--image (object)

The image request for the datalake. This must not be set if the runtime parameter is provided. If it is set, the image ID is required; the image catalog name is optional and defaults to 'cdp-default'.

id -> (string)

The image ID from the catalog. The corresponding image will be used for the created cluster machines.

catalogName -> (string)

The name of the custom image catalog to use.

Shorthand Syntax:

id=string,catalogName=string

JSON Syntax:

{
  "id": "string",
  "catalogName": "string"
}
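The interaction between --runtime and --image can be summarized in a small helper. This is a sketch of the documented constraints only; the function name `image_or_runtime_args` is hypothetical, and it leaves catalogName out when no catalog is given so that the documented default applies.

```python
import json
from typing import List, Optional


def image_or_runtime_args(runtime: Optional[str] = None,
                          image_id: Optional[str] = None,
                          catalog_name: Optional[str] = None) -> List[str]:
    """Return the --runtime or --image arguments, enforcing the documented rules."""
    # --image must not be set if --runtime is provided.
    if runtime is not None and (image_id is not None or catalog_name is not None):
        raise ValueError("--image must not be set when --runtime is provided")
    if runtime is not None:
        return ["--runtime", runtime]
    if image_id is None:
        # The image ID is required whenever an image request is present.
        if catalog_name is not None:
            raise ValueError("the image ID is required when --image is used")
        return []
    image = {"id": image_id}
    if catalog_name is not None:
        image["catalogName"] = catalog_name
    return ["--image", json.dumps(image)]
```

The result is a list meant to be appended to an argument list, so no shell quoting of the JSON value is needed.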

--enable-ranger-raz | --no-enable-ranger-raz (boolean)

Whether to enable Ranger RAZ for the datalake. Defaults to not being enabled.

--multi-az | --no-multi-az (boolean)

Whether to deploy the datalake across multiple availability zones.

--recipes (array)

Additional recipes to attach to the datalake instances, specified per instance group. Common instance groups include 'master' and 'idbroker'.

Shorthand Syntax:

instanceGroupName=string,recipeNames=string,string ... (separate items with spaces)

JSON Syntax:

[
  {
    "instanceGroupName": "string",
    "recipeNames": ["string", ...]
  }
  ...
]

--custom-instance-groups (array)

Configures custom properties at the instance group level.

Shorthand Syntax:

name=string,instanceType=string ... (separate items with spaces)

JSON Syntax:

[
  {
    "name": "string",
    "instanceType": "string"
  }
  ...
]

--cli-input-json (string)

Performs the service operation based on the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are also provided on the command line, the command-line values override the JSON-provided values.

--generate-cli-skeleton (boolean)

Prints a sample input JSON to standard output. Note that the specified operation is not run if this argument is specified. The sample input can be used as an argument for --cli-input-json.
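The precedence between --cli-input-json and explicit arguments can be modeled as a dictionary merge. The sketch below only illustrates the documented rule and is not the CLI's implementation. The function name `effective_parameters` is hypothetical, and the camelCase keys used with it (such as `datalakeName`) are an assumption; the output of --generate-cli-skeleton is the authoritative source for the key names.

```python
import json
from typing import Any, Dict


def effective_parameters(cli_input_json: str,
                         cli_values: Dict[str, Any]) -> Dict[str, Any]:
    """Merge JSON input with command-line values; command-line values win."""
    merged = json.loads(cli_input_json)
    for key, value in cli_values.items():
        # None stands for "argument not given on the command line".
        if value is not None:
            merged[key] = value
    return merged
```

For example, a JSON document that sets both a name and an environment, combined with a command-line name, yields the command-line name and the JSON environment.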


datalake -> (object)

Information about a datalake.

datalakeName -> (string)

The name of the datalake.

crn -> (string)

The CRN of the datalake.

status -> (string)

The status of the datalake.

environmentCrn -> (string)

The CRN of the environment.

creationDate -> (datetime)

The date when the datalake was created.

statusReason -> (string)

The reason for the status of the datalake.

enableRangerRaz -> (boolean)

Whether Ranger RAZ is enabled for the datalake.

certificateExpirationState -> (string)

Indicates the certificate status on the cluster.

multiAz -> (boolean)

Whether the datalake is deployed across multiple availability zones.
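A script that consumes the output might pick out a few of the fields listed above. This sketch assumes that the command prints a JSON document whose top-level key is `datalake`, as the structure above suggests. The function name `summarize_datalake` is hypothetical, and any response passed to it in an example is made-up data, not real output.

```python
import json
from typing import Any, Dict

# Fields taken from the output description above.
SUMMARY_FIELDS = ("datalakeName", "crn", "status", "statusReason",
                  "enableRangerRaz", "multiAz")


def summarize_datalake(response_text: str) -> Dict[str, Any]:
    """Extract selected fields from the command output; missing fields become None."""
    datalake = json.loads(response_text)["datalake"]
    return {field: datalake.get(field) for field in SUMMARY_FIELDS}
```

Using `dict.get` keeps the helper tolerant of fields that are absent from a particular response.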

Form Factors