Creates an Azure datalake.





--datalake-name <value>
--environment-name <value>
--cloud-provider-configuration <value>
[--scale <value>]
[--tags <value>]
[--runtime <value>]
[--image <value>]
[--load-balancer-sku <value>]
[--enable-ranger-raz | --no-enable-ranger-raz]
[--database-type <value>]
[--flexible-server-delegated-subnet-id <value>]
[--recipes <value>]
[--java-version <value>]
[--multi-az | --no-multi-az]
[--cli-input-json <value>]


--datalake-name (string)

The datalake name. The name must be unique, must be between 5 and 40 characters long, and must contain only lowercase letters, numbers, and hyphens. Names are case-sensitive.
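The naming rule above can be checked locally before a request is sent. The following is a minimal sketch, not part of the CLI: the function name `is_valid_datalake_name` is hypothetical, and the check covers only length and character set. Uniqueness can only be verified by the service.

```python
import re

# Pattern derived from the documented rule: 5 to 40 characters,
# lowercase letters, digits and hyphens only.
_DATALAKE_NAME = re.compile(r"[a-z0-9-]{5,40}")


def is_valid_datalake_name(name: str) -> bool:
    """Return True if `name` meets the documented length and character rules.

    Hypothetical helper; it does not check uniqueness.
    """
    return _DATALAKE_NAME.fullmatch(name) is not None
```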

--environment-name (string)

The environment name or CRN.

--cloud-provider-configuration (object)

Request object for Azure configuration.

managedIdentity -> (string)

The managed identity to use. The assumer identity should have the Virtual Machine Contributor and Managed Identity Operator roles at the subscription level.

storageLocation -> (string)

The storage location to use, given as an abfs:// location. The filesystem must already exist, and the storage account must be StorageV2.

Shorthand Syntax:

managedIdentity=string,storageLocation=string

JSON Syntax:

{
  "managedIdentity": "string",
  "storageLocation": "string"
}
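The JSON form of this object can be generated rather than typed by hand. The sketch below assumes nothing beyond the two documented keys; the helper name `azure_configuration` is hypothetical, and only the abfs:// scheme is checked because the rest of the location format is not spelled out here.

```python
import json


def azure_configuration(managed_identity: str, storage_location: str) -> str:
    """Return JSON text suitable for --cloud-provider-configuration.

    Hypothetical helper: it validates only the abfs:// scheme and makes
    no assumption about the remainder of the storage location.
    """
    if not storage_location.startswith("abfs://"):
        raise ValueError("storageLocation must start with abfs://")
    return json.dumps(
        {"managedIdentity": managed_identity, "storageLocation": storage_location}
    )
```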

--scale (string)

Represents the available datalake scales. Defaults to LIGHT_DUTY if not set.

Possible values:




--tags (array)

Tags to be added to Data Lake related resources.

Shorthand Syntax:

key=string,value=string ... (separate items with spaces)

JSON Syntax:

[
  {
    "key": "string",
    "value": "string"
  }
  ...
]
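Both tag syntaxes can be produced from a plain mapping. This is an illustrative sketch: `tags_shorthand` and `tags_json` are hypothetical names, and the shorthand variant does no quoting, so it is only suitable for keys and values without commas or spaces.

```python
import json


def tags_shorthand(tags: dict) -> list:
    """Return one 'key=...,value=...' item per tag.

    Each item is meant to be passed as a separate argument after --tags.
    No escaping is applied (assumption: simple keys and values).
    """
    return [f"key={key},value={value}" for key, value in tags.items()]


def tags_json(tags: dict) -> str:
    """Return the same tags as a JSON array of key/value objects."""
    return json.dumps([{"key": key, "value": value} for key, value in tags.items()])
```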

--runtime (string)

Cloudera Runtime version.

--image (object)

The image request for the datalake. When the ‘runtime’ parameter is set, only the ‘os’ parameter can be provided. Otherwise, you can use ‘catalog name’ and/or ‘id’ for selecting an image.

id -> (string)

The image ID from the catalog. The corresponding image will be used for the created cluster machines.

catalogName -> (string)

The name of the custom image catalog to use, defaulting to ‘cdp-default’ if not present.

os -> (string)

The OS of the image used for cluster instances.

Shorthand Syntax:

id=string,catalogName=string,os=string

JSON Syntax:

{
  "id": "string",
  "catalogName": "string",
  "os": "string"
}
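The interaction between --runtime and --image described above can be expressed as a small pre-flight check. This is a sketch of the documented rule only, under the assumption that the image request is held as a dictionary with the keys id, catalogName, and os; the function name `check_image_request` is hypothetical and the service may apply further validation.

```python
from typing import Optional


def check_image_request(runtime: Optional[str], image: Optional[dict]) -> None:
    """Raise ValueError if `image` conflicts with `runtime`.

    Documented rule: when a runtime is set, only 'os' may be provided
    in the image request.
    """
    if runtime is None or not image:
        return
    disallowed = sorted(set(image) - {"os"})
    if disallowed:
        raise ValueError(
            "with --runtime set, only 'os' is allowed in --image; got: "
            + ", ".join(disallowed)
        )
```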

--load-balancer-sku (string)

Represents the Azure load balancer SKU type. The current default is BASIC. To disable the load balancer, use type NONE.

Possible values:



  • NONE

--enable-ranger-raz | --no-enable-ranger-raz (boolean)

Whether to enable Ranger RAZ for the datalake. Defaults to not being enabled.

--database-type (string)

The type of the Azure database. FLEXIBLE_SERVER is the next-generation managed PostgreSQL service in Azure; it provides maximum flexibility over your database and built-in cost optimizations. SINGLE_SERVER is a fully managed database service with minimal requirements for customization of the database.

Possible values:

  • FLEXIBLE_SERVER

  • SINGLE_SERVER

--flexible-server-delegated-subnet-id (string)

The ID of the delegated subnet in which to configure the Azure Flexible Server.

--recipes (array)

Additional recipes to attach to the datalake instances, grouped by instance group. The most common instance groups are ‘master’ and ‘idbroker’.

Shorthand Syntax:

instanceGroupName=string,recipeNames=string,string ... (separate items with spaces)

JSON Syntax:

[
  {
    "instanceGroupName": "string",
    "recipeNames": ["string", ...]
  }
  ...
]
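The recipe shorthand packs a list into each item, which is easy to get wrong by hand. The sketch below builds the items from a mapping of instance group to recipe names; `recipes_shorthand` is a hypothetical helper, and it assumes recipe names contain no commas or spaces.

```python
def recipes_shorthand(recipes: dict) -> list:
    """Return one shorthand item per instance group.

    `recipes` maps an instance group name to a list of recipe names.
    Each returned item is meant to be a separate argument after --recipes.
    """
    return [
        f"instanceGroupName={group},recipeNames={','.join(names)}"
        for group, names in recipes.items()
    ]
```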

--java-version (integer)

Configure the major version of Java on the cluster.

--multi-az | --no-multi-az (boolean)

Creates a CDP datalake distributed across multiple availability zones in an Azure region.

--cli-input-json (string)

Performs the service operation based on the JSON string provided. The JSON string follows the format provided by --generate-cli-skeleton. If other arguments are provided on the command line, the CLI values will override the JSON-provided values.
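The precedence rule can be pictured as a dictionary merge in which command-line values win. The following sketch only illustrates that documented behavior; it is not the CLI's implementation. The function name `effective_arguments` is hypothetical, and the key names are whatever the generated skeleton uses.

```python
import json


def effective_arguments(cli_input_json: str, cli_values: dict) -> dict:
    """Merge JSON-provided values with command-line values.

    Values given on the command line override the JSON-provided ones.
    Entries set to None are treated as "not given on the command line".
    """
    merged = json.loads(cli_input_json)
    merged.update(
        {name: value for name, value in cli_values.items() if value is not None}
    )
    return merged
```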

--generate-cli-skeleton (boolean)

Prints a sample input JSON to standard output. Note that the specified operation is not run if this argument is specified. The sample input can be used as an argument for --cli-input-json.


datalake -> (object)

Information about a datalake.

datalakeName -> (string)

The name of the datalake.

crn -> (string)

The CRN of the datalake.

status -> (string)

The status of the datalake.

environmentCrn -> (string)

The CRN of the environment.

creationDate -> (datetime)

The date when the datalake was created.

statusReason -> (string)

The reason for the status of the datalake.

enableRangerRaz -> (boolean)

Whether Ranger RAZ is enabled for the datalake.

certificateExpirationState -> (string)

Indicates the certificate status on the cluster.

multiAz -> (boolean)

Whether the datalake is deployed across multiple availability zones.

tags -> (array)

Datalake tags object containing the tag values defined for the datalake.

item -> (object)

Tag for a datalake resource.

key -> (string)

The key of the tag.

value -> (string)

The value of the tag.
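A script that calls this command will usually want a few fields from the response. The sketch below assumes the output is JSON text with a top-level datalake object shaped as described above; `summarize_datalake` is a hypothetical helper, and fields missing from the response are returned as None or empty.

```python
import json


def summarize_datalake(response_text: str) -> dict:
    """Extract commonly used fields from the command's JSON output."""
    datalake = json.loads(response_text)["datalake"]
    return {
        "name": datalake.get("datalakeName"),
        "crn": datalake.get("crn"),
        "status": datalake.get("status"),
        "multiAz": datalake.get("multiAz", False),
        # Flatten the list of {key, value} objects into a plain mapping.
        "tags": {tag["key"]: tag["value"] for tag in datalake.get("tags", [])},
    }
```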

Form Factors