cdk:create-dataset
Full name:
com.cloudera.cdk:cdk-maven-plugin:0.8.0:create-dataset
Description:
Create a named dataset whose entries conform to a defined schema.
Required Parameters
Name | Type | Since | Description |
---|---|---|---|
avroSchemaFile | String | - | The file containing the Avro schema. If no file with the specified name is found on the local filesystem, then the classpath is searched for a matching resource. User property is: cdk.avroSchemaFile. |
datasetName | String | - | The name of the dataset to create. User property is: cdk.datasetName. |
Optional Parameters
Name | Type | Since | Description |
---|---|---|---|
format | String | - | The file format (avro or parquet). User property is: cdk.format. |
hadoopConfiguration | Properties | - | Hadoop configuration properties. User property is: cdk.hadoopConfiguration. |
hcatalog | boolean | - | If true, store dataset metadata in HCatalog; otherwise, store it on the filesystem. User property is: cdk.hcatalog. |
partitionExpression | String | - | The partition expression, in JEXL format (experimental). User property is: cdk.partitionExpression. |
rootDirectory | String | - | The root directory of the dataset repository. Optional if using HCatalog for metadata storage. User property is: cdk.rootDirectory. |
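As a minimal sketch, the parameters in the tables above could be set in a project's pom.xml roughly as follows. The plugin coordinates are taken from the full name given earlier. The schema path, dataset name, and root directory are hypothetical placeholders, and no binding to a build phase is shown.

```xml
<!-- Sketch only: every value below is an illustrative placeholder -->
<build>
  <plugins>
    <plugin>
      <groupId>com.cloudera.cdk</groupId>
      <artifactId>cdk-maven-plugin</artifactId>
      <version>0.8.0</version>
      <configuration>
        <!-- Required parameters -->
        <avroSchemaFile>src/main/avro/user.avsc</avroSchemaFile>
        <datasetName>users</datasetName>
        <!-- Optional parameters; rootDirectory is set on the assumption
             that it is needed when HCatalog is not used for metadata -->
        <rootDirectory>/tmp/data</rootDirectory>
        <format>avro</format>
      </configuration>
    </plugin>
  </plugins>
</build>
```

With a configuration like this in place, the goal would be run as `mvn cdk:create-dataset`. Any of the values can instead be supplied through the user property listed for that parameter, for example `-Dcdk.datasetName=users`.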
Parameter Details
avroSchemaFile:
The file containing the Avro schema. If no file with the specified name is found on the local filesystem, then the classpath is searched for a matching resource.
- Type: java.lang.String
- Required: Yes
- User Property: cdk.avroSchemaFile
datasetName:
The name of the dataset to create.
- Type: java.lang.String
- Required: Yes
- User Property: cdk.datasetName
format:
The file format (avro or parquet).
- Type: java.lang.String
- Required: No
- User Property: cdk.format
hadoopConfiguration:
Hadoop configuration properties.
- Type: java.util.Properties
- Required: No
- User Property: cdk.hadoopConfiguration
hcatalog:
If true, store dataset metadata in HCatalog; otherwise, store it on the filesystem.
- Type: boolean
- Required: No
- User Property: cdk.hcatalog
partitionExpression:
The partition expression, in JEXL format (experimental).
- Type: java.lang.String
- Required: No
- User Property: cdk.partitionExpression
rootDirectory:
The root directory of the dataset repository. Optional if using HCatalog for metadata storage.
- Type: java.lang.String
- Required: No
- User Property: cdk.rootDirectory
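For comparison, here is a sketch of a `<configuration>` block for the HCatalog case, in which rootDirectory is left out. It assumes Maven's usual mapping of a java.util.Properties parameter to nested `<property>` elements with `<name>` and `<value>` children. The Hadoop property shown (`fs.defaultFS`) and its host name are illustrative assumptions, not values the plugin requires.

```xml
<!-- Sketch only: goes inside the plugin's <plugin> element; values are placeholders -->
<configuration>
  <!-- A bare file name; per the description above, the classpath is
       searched if no such file exists on the local filesystem -->
  <avroSchemaFile>user.avsc</avroSchemaFile>
  <datasetName>users</datasetName>
  <!-- Store dataset metadata in HCatalog instead of on the filesystem -->
  <hcatalog>true</hcatalog>
  <format>parquet</format>
  <!-- java.util.Properties parameter: one <property> element per entry -->
  <hadoopConfiguration>
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://namenode.example.com:8020</value>
    </property>
  </hadoopConfiguration>
</configuration>
```

The partitionExpression parameter is omitted here because the page does not describe the expression syntax beyond noting that it is JEXL and experimental.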