com.cloudera.cdk.data.crunch
Class CrunchDatasets

java.lang.Object
  extended by com.cloudera.cdk.data.crunch.CrunchDatasets

@Beta
public class CrunchDatasets
extends Object

A helper class for exposing a filesystem-based dataset as a Crunch ReadableSource or Target.


Constructor Summary
CrunchDatasets()
           
 
Method Summary
static <E> ReadableSource<E> asSource(Dataset<E> dataset, Class<E> type)
          Expose the given Dataset as a Crunch ReadableSource.
static Target asTarget(Dataset dataset)
          Expose the given Dataset as a Crunch Target.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CrunchDatasets

public CrunchDatasets()

Method Detail

asSource

public static <E> ReadableSource<E> asSource(Dataset<E> dataset,
                                             Class<E> type)
Expose the given Dataset as a Crunch ReadableSource. Only the FileSystem Dataset implementation is supported, and the file format must be Formats.PARQUET or Formats.AVRO.

Type Parameters:
E - the type of entity produced by the source
Parameters:
dataset - the dataset to read from
type - the Java type of the entities in the dataset
Returns:
the ReadableSource, or null if the dataset is not filesystem-based.

asTarget

public static Target asTarget(Dataset dataset)
Expose the given Dataset as a Crunch Target. Only the FileSystem Dataset implementation is supported, and the file format must be Formats.PARQUET or Formats.AVRO. In addition, the given Dataset must either be unpartitioned or be a leaf partition in the partition hierarchy. The Target returned by this method does not write to sub-partitions.

Parameters:
dataset - the dataset to write to
Returns:
the Target, or null if the dataset is not filesystem-based.
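
Both asSource and asTarget are documented to return null, rather than throw, when the dataset is not filesystem-based, so a caller may prefer to fail fast before passing the result to a pipeline. The following is a minimal sketch of one way to do that. CrunchDatasetGuards and requireSupported are hypothetical names and are not part of CDK or Crunch; the only behaviour assumed is the null return described above.

```java
// Hypothetical helper, not part of CDK or Crunch. It relies only on the
// documented behaviour that asSource/asTarget return null for datasets
// that are not filesystem-based.
final class CrunchDatasetGuards {

    private CrunchDatasetGuards() {
        // static helper only, never instantiated
    }

    /**
     * Returns the given value unchanged, or throws if it is null.
     *
     * @param sourceOrTarget result of CrunchDatasets.asSource or asTarget
     * @param datasetName    label used only in the error message
     */
    static <T> T requireSupported(T sourceOrTarget, String datasetName) {
        if (sourceOrTarget == null) {
            throw new IllegalArgumentException(
                "Dataset '" + datasetName + "' is not filesystem-based; "
                + "CrunchDatasets returned null");
        }
        return sourceOrTarget;
    }

    // Intended use, with the CDK and Crunch jars on the classpath
    // (events, results and Event are placeholder names):
    //
    //   ReadableSource<Event> source = requireSupported(
    //       CrunchDatasets.asSource(events, Event.class), "events");
    //   Target target = requireSupported(
    //       CrunchDatasets.asTarget(results), "results");
}
```

The helper is generic so that the same check covers both the ReadableSource returned by asSource and the Target returned by asTarget. It does not check the file format or partitioning constraints, which remain the caller's responsibility.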


Copyright © 2013 Cloudera. All rights reserved.