PCollection instances.
SourceTarget types.
InputFormat for Avro data files.
OutputFormat for Avro data files.
RecordReader for Avro data files.
AvroTypeFamily for convenient static importing.
InputFormat for text files.
DoFn is associated with.
PTable arguments.
DoFn implementation that converts an Iterable of values into a single value.
CombineFn that delegates all of the actual work to an Aggregator instance.
CombineFn.
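The idea of converting an Iterable of values into a single value can be sketched in plain Java. This is an analogy only and does not use the Crunch API: the class `CombineSketch` and the method `combine` are hypothetical names, and the `BinaryOperator` stands in for the aggregator that does the actual work.

```java
import java.util.Arrays;
import java.util.function.BinaryOperator;

public class CombineSketch {
    // Hypothetical helper (not part of Crunch): folds an Iterable of values
    // into a single value by delegating each step to the supplied operator.
    public static <T> T combine(Iterable<T> values, T identity, BinaryOperator<T> op) {
        T result = identity;
        for (T v : values) {
            result = op.apply(result, v);
        }
        return result;
    }

    public static void main(String[] args) {
        // Sum three values into one.
        long sum = combine(Arrays.asList(1L, 2L, 3L), 0L, Long::sum);
        System.out.println(sum);
    }
}
```

Keeping the fold separate from the operator mirrors the delegation described above: the same loop serves sums, maxima, or any other associative reduction.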
comm utility.
DoFn, or takes the output of a DoFn and writes it to the output key/values.
PTable that contains the unique elements of this collection mapped to a count of their occurrences.
PTable instance that contains the counts of each unique element of this PCollection.
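As a rough illustration of mapping each unique element to a count of its occurrences, here is a plain-Java sketch over an in-memory list. It is not the Crunch API: `CountSketch` and `count` are hypothetical names, and a `Map` stands in for the resulting table.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CountSketch {
    // Hypothetical helper (not part of Crunch): maps each unique element
    // to the number of times it occurs in the input.
    public static <T extends Comparable<T>> Map<T, Long> count(List<T> elements) {
        Map<T, Long> counts = new TreeMap<>();
        for (T e : elements) {
            // Start at 1 for a new element, otherwise add 1 to the existing count.
            counts.merge(e, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("a", "b", "a"))); // prints {a=2, b=1}
    }
}
```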
Tool interface that creates a Pipeline instance and provides methods for working with the Pipeline from inside of the Tool's run method.
run.
DoFn.
PCollection.
DoFn for the common case of filtering the members of a PCollection based on a boolean condition.
Source types.
Configuration instance associated with this pipeline.
SourceTarget that is able to read/write data using the serialization format specified by this PType.
PTypeFamily that this PType belongs to.
PType of the key.
Pipeline associated with this PCollection.
InputSplit is complete.
PTableType of this PTable.
PType of this PCollection.
PCollection in bytes.
Source.
PType for this source.
PType.
PTypeFamily of this PCollection.
PType of the value.
GroupingOptions to control how the grouping is executed.
groupByKey operation in order to exercise finer control over how the partitioning, grouping, and sorting of keys is performed.
GroupingOptions instances.
DoFn is associated with.
Emitter implementation that links the output of one DoFn to the input of another DoFn.
DoNode instances in a job and builds a String that identifies the stages of the pipeline that belong to this job.
PTable instances based on a common key.
DoFn for the common case of emitting exactly one value for each input record.
PCollection made up of only the maximum element of this instance.
PCollection made up of only the minimum element of this instance.
DoNode instance, so we know how to use it within the context of a particular MR job.
Tuples.
PCollection and returns a new PCollection that is the output of this processing.
PCollection and returns a new PCollection that is the output of this processing.
parallelDo instance, but returns a PTable instance instead of a PCollection.
parallelDo instance, but returns a PTable instance instead of a PCollection.
PTable.
PType instance for PGroupedTable instances.
PCollection.
PCollection that represents an immutable, distributed multi-map of keys and values.
PType specifically for PTable objects.
PType defines a mapping between a data type that is used in a Crunch pipeline and a serialization and storage format that is used to read/write data from/to HDFS.
PType instances that have the same serialization/storage backing format.
PTypes from different PTypeFamily implementations.
Source into a PCollection that is available to jobs run using this Pipeline instance.
TableSource instances that map to PTables.
SourceTarget instance can be read into the local client.
Serialization used by jobs configured with AvroJob.
PCollection will cause it to change in size.
PCollection instances.
Configuration to use with this pipeline.
Configuration instance to be used during unit tests.
TaskInputOutputContext to this DoFn instance.
PCollection instances.
PCollection using the natural ordering of its elements.
PCollection using the natural ordering of its elements in the order specified.
PTable using the natural ordering of its keys.
PTable using the natural ordering of its keys in the order specified.
PCollection instance that contains all of the elements of this instance in sorted order.
sortPairs(coll, by(2, ASCENDING), by(1, DESCENDING))
Column numbering is 1-based.
PCollection of Pairs using the specified column ordering.
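To make the 1-based column ordering concrete, the following plain-Java sketch sorts in-memory rows by the second column ascending and then by the first column descending. It is an analogy, not the Crunch implementation: `ColumnSortSketch`, `ColumnOrder`, `Order`, `by`, and `sortRows` are local, hypothetical stand-ins that only echo the names in the example call above, and rows are modeled as `String[]` arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ColumnSortSketch {
    public enum Order { ASCENDING, DESCENDING }

    // Hypothetical stand-in for a column ordering: a 1-based column plus a direction.
    public static final class ColumnOrder {
        final int column;
        final Order order;

        ColumnOrder(int column, Order order) {
            this.column = column;
            this.order = order;
        }
    }

    public static ColumnOrder by(int column, Order order) {
        return new ColumnOrder(column, order);
    }

    // Returns a sorted copy of the rows; earlier orderings take precedence,
    // later ones only break ties.
    public static List<String[]> sortRows(List<String[]> rows, ColumnOrder... orders) {
        Comparator<String[]> cmp = (a, b) -> 0;
        for (ColumnOrder o : orders) {
            final int idx = o.column - 1; // convert the 1-based column to a 0-based index
            Comparator<String[]> c = (a, b) -> a[idx].compareTo(b[idx]);
            if (o.order == Order.DESCENDING) {
                c = c.reversed();
            }
            cmp = cmp.thenComparing(c);
        }
        List<String[]> sorted = new ArrayList<>(rows);
        sorted.sort(cmp);
        return sorted;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"b", "1"},
            new String[] {"a", "2"},
            new String[] {"c", "1"});
        for (String[] row : sortRows(rows, by(2, Order.ASCENDING), by(1, Order.DESCENDING))) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

With this input, the two rows whose second column is "1" come first, ordered "c" before "b" because the first column is the descending tie-breaker.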
PCollection of Tuple4s using the specified column ordering.
PCollection of Tuple3s using the specified column ordering.
PCollection of TupleNs using the specified column ordering.
Source represents an input data set that is an input to one or more MapReduce jobs.
Source and the Target interfaces.
Source implementations that return a PTable.
Target represents the output destination of a Crunch job.
Target types.
PCollections.
Tuples.
Tuples.
Tuple instance for an arbitrary number of values.
Tuple interface.
PCollection instance that acts as the union of this PCollection and the input PCollections.
PTable instance that acts as the union of this PTable and the input PTables.
WritableTypeFamily for convenient static importing.
Writable-based implementation of the PTypeFamily interface.
PCollection to the given Target, using the storage format specified by the target.
PTable to the given Target.
out.