com.cloudera.cdk.data.crunch
Class CrunchDatasets
java.lang.Object
com.cloudera.cdk.data.crunch.CrunchDatasets
@Beta
public class CrunchDatasets
extends Object
A helper class for exposing a filesystem-based dataset as a Crunch ReadableSource or Target.
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CrunchDatasets
public CrunchDatasets()
asSource
public static <E> ReadableSource<E> asSource(Dataset<E> dataset, Class<E> type)
Expose the given Dataset as a Crunch ReadableSource. Only the FileSystem Dataset implementation is supported, and the file format must be Formats.PARQUET or Formats.AVRO.
Type Parameters:
E - the type of entity produced by the source
Parameters:
dataset - the dataset to read from
type - the Java type of the entities in the dataset
Returns:
the ReadableSource, or null if the dataset is not filesystem-based.
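Because asSource reports an unsupported dataset by returning null rather than throwing, callers may want to fail fast before passing the result to a pipeline. The following is a minimal sketch of such a guard. SourceGuard and requireSupported are hypothetical names, not part of CDK or Crunch, and the commented call assumes a Dataset<User> named users and a Crunch Pipeline named pipeline are already in scope.

```java
/** Hypothetical helper for illustration; not part of CDK or Crunch. */
public final class SourceGuard {
    private SourceGuard() {}

    /**
     * Returns the given value unchanged, or throws if it is null.
     *
     * @param value       the result of a call such as CrunchDatasets.asSource(...)
     * @param datasetName a label used only in the error message
     */
    public static <T> T requireSupported(T value, String datasetName) {
        if (value == null) {
            // A null result is documented to mean the dataset is not filesystem-based.
            throw new IllegalArgumentException(
                "Dataset '" + datasetName + "' is not filesystem-based");
        }
        return value;
    }

    // Intended use (assumed names: users, User, pipeline):
    //   ReadableSource<User> source = SourceGuard.requireSupported(
    //       CrunchDatasets.asSource(users, User.class), "users");
    //   PCollection<User> records = pipeline.read(source);
}
```

Throwing at the call site keeps a later NullPointerException inside the pipeline from hiding the real cause.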
asTarget
public static Target asTarget(Dataset dataset)
Expose the given Dataset as a Crunch Target. Only the FileSystem Dataset implementation is supported, and the file format must be Formats.PARQUET or Formats.AVRO. In addition, the given Dataset must not be partitioned, or must be a leaf partition in the partition hierarchy. The Target returned by this method will not write to sub-partitions.
Parameters:
dataset - the dataset to write to
Returns:
the Target, or null if the dataset is not filesystem-based.
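The partitioning precondition for asTarget can be illustrated with a small, library-independent model. This is a sketch under an assumption: it treats a partition hierarchy as an ordered list of partition field names, and a partition as the list of values fixed so far, so a leaf is a partition with a value for every field. PartitionDepth and isUnpartitionedOrLeaf are hypothetical names, not CDK types or methods.

```java
import java.util.List;

/** Hypothetical model of a partition hierarchy; not a CDK type. */
public final class PartitionDepth {
    private PartitionDepth() {}

    /**
     * Checks the documented precondition: unpartitioned, or a leaf partition.
     *
     * @param partitionFields partition levels, outermost first; empty if unpartitioned
     * @param boundValues     values already fixed for this partition, outermost first
     * @return true if no sub-partitions remain below this point
     */
    public static boolean isUnpartitionedOrLeaf(List<String> partitionFields,
                                                List<?> boundValues) {
        if (boundValues.size() > partitionFields.size()) {
            throw new IllegalArgumentException(
                "More partition values than partition fields");
        }
        // Every level is fixed (or there are no levels), so nothing lies below.
        return boundValues.size() == partitionFields.size();
    }
}
```

Under this model, a dataset partitioned by year and month qualifies only once both values are fixed; the top-level dataset and a year-only partition do not.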
Copyright © 2013 Cloudera. All rights reserved.