com.cloudera.crunch
Interface PTable<K,V>

All Superinterfaces:
PCollection<Pair<K,V>>
All Known Implementing Classes:
DoTableImpl, InputTable, MemTable, PTableBase, UnionTable

public interface PTable<K,V>
extends PCollection<Pair<K,V>>

A sub-interface of PCollection that represents an immutable, distributed multi-map of keys and values.


Method Summary
 PTable<K,V> bottom(int count)
          Returns a PTable made up of the pairs in this PTable with the smallest value field.
<U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
          Performs a co-group operation with the given table on common keys.
 PTable<K,Collection<V>> collectValues()
          Aggregates all of the values with the same key into a single key-value pair in the returned PTable.
 PType<K> getKeyType()
          Returns the PType of the key.
 PTableType<K,V> getPTableType()
          Returns the PTableType of this PTable.
 PType<V> getValueType()
          Returns the PType of the value.
 PGroupedTable<K,V> groupByKey()
          Performs a grouping operation on the keys of this table.
 PGroupedTable<K,V> groupByKey(GroupingOptions options)
          Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.
 PGroupedTable<K,V> groupByKey(int numPartitions)
          Performs a grouping operation on the keys of this table, using the given number of partitions.
<U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
          Performs an inner join on this table and the one passed in as an argument on their common keys.
 PTable<K,V> top(int count)
          Returns a PTable made up of the pairs in this PTable with the largest value field.
 PTable<K,V> union(PTable<K,V>... others)
          Returns a PTable instance that acts as the union of this PTable and the input PTables.
 PTable<K,V> write(Target target)
          Writes this PTable to the given Target.
 
Methods inherited from interface com.cloudera.crunch.PCollection
count, filter, getName, getPipeline, getPType, getSize, getTypeFamily, materialize, max, min, parallelDo, parallelDo, parallelDo, parallelDo, sample, sample, sort, union
 

Method Detail

union

PTable<K,V> union(PTable<K,V>... others)
Returns a PTable instance that acts as the union of this PTable and the input PTables.


groupByKey

PGroupedTable<K,V> groupByKey()
Performs a grouping operation on the keys of this table.

Returns:
a PGroupedTable instance that represents the grouping

groupByKey

PGroupedTable<K,V> groupByKey(int numPartitions)
Performs a grouping operation on the keys of this table, using the given number of partitions.

Parameters:
numPartitions - The number of partitions for the data.
Returns:
a PGroupedTable instance that represents the grouping

groupByKey

PGroupedTable<K,V> groupByKey(GroupingOptions options)
Performs a grouping operation on the keys of this table, using the additional GroupingOptions to control how the grouping is executed.

Parameters:
options - The grouping options to use
Returns:
a PGroupedTable instance that represents the grouping

write

PTable<K,V> write(Target target)
Writes this PTable to the given Target.

Specified by:
write in interface PCollection<Pair<K,V>>
Parameters:
target - The target to write to

getPTableType

PTableType<K,V> getPTableType()
Returns the PTableType of this PTable.


getKeyType

PType<K> getKeyType()
Returns the PType of the key.


getValueType

PType<V> getValueType()
Returns the PType of the value.


collectValues

PTable<K,Collection<V>> collectValues()
Aggregates all of the values with the same key into a single key-value pair in the returned PTable.
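
The effect of collectValues can be pictured with a small in-memory model. The sketch below is not the Crunch API: the class CollectValuesSketch, the helper pair, and the use of a List of Map.Entry as a stand-in for a PTable are all assumptions made for illustration, and the real operation runs as a distributed job rather than over a local list.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model only (assumption): a List<Map.Entry<K,V>> stands in for a PTable<K,V>.
final class CollectValuesSketch {

    // Hypothetical helper that builds one key-value pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleImmutableEntry<>(key, value);
    }

    // Gathers every value that shares a key into a single collection for that key.
    static <K, V> Map<K, Collection<V>> collectValues(List<Map.Entry<K, V>> table) {
        Map<K, Collection<V>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> p : table) {
            out.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table =
            Arrays.asList(pair("a", 1), pair("b", 2), pair("a", 3));
        System.out.println(collectValues(table)); // prints {a=[1, 3], b=[2]}
    }
}
```

Because a PTable is a multi-map, the key "a" appears twice in the input; after collection it appears once, paired with both of its values.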


top

PTable<K,V> top(int count)
Returns a PTable made up of the pairs in this PTable with the largest value field.

Parameters:
count - The number of pairs to return

bottom

PTable<K,V> bottom(int count)
Returns a PTable made up of the pairs in this PTable with the smallest value field.

Parameters:
count - The number of pairs to return
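
As an illustration of what top and bottom select, the following sketch orders pairs by their value field and keeps the first count of them. It is an in-memory model, not the Crunch implementation: TopBottomSketch, pair, and select are hypothetical names, and the ordering of the returned pairs and the handling of ties are assumptions that the description above does not specify.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// In-memory model only (assumption): a List<Map.Entry<K,V>> stands in for a PTable<K,V>.
final class TopBottomSketch {

    // Hypothetical helper that builds one key-value pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleImmutableEntry<>(key, value);
    }

    // Pairs with the largest values, largest first.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> top(
            List<Map.Entry<K, V>> table, int count) {
        return select(table, count, Map.Entry.<K, V>comparingByValue().reversed());
    }

    // Pairs with the smallest values, smallest first.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> bottom(
            List<Map.Entry<K, V>> table, int count) {
        return select(table, count, Map.Entry.<K, V>comparingByValue());
    }

    // Sorts a copy of the table and keeps at most `count` leading pairs.
    private static <K, V> List<Map.Entry<K, V>> select(
            List<Map.Entry<K, V>> table, int count, Comparator<Map.Entry<K, V>> order) {
        List<Map.Entry<K, V>> sorted = new ArrayList<>(table);
        sorted.sort(order);
        int n = Math.max(0, Math.min(count, sorted.size()));
        return new ArrayList<>(sorted.subList(0, n));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> table =
            Arrays.asList(pair("a", 5), pair("b", 1), pair("c", 9), pair("d", 3));
        System.out.println(top(table, 2));    // prints [c=9, a=5]
        System.out.println(bottom(table, 2)); // prints [b=1, d=3]
    }
}
```

The comparison is on the value field only; keys play no part in the selection.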

join

<U> PTable<K,Pair<V,U>> join(PTable<K,U> other)
Performs an inner join on this table and the one passed in as an argument on their common keys.
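
The shape of the join result, PTable<K,Pair<V,U>>, can be illustrated with a local hash join. This is a sketch under stated assumptions, not the Crunch implementation: JoinSketch and pair are hypothetical names, lists of Map.Entry stand in for the two tables, and the sketch assumes the usual relational meaning of an inner join, in which a key that occurs several times on either side yields one output pair per matching combination.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model only (assumption): a List<Map.Entry<K,V>> stands in for a PTable<K,V>.
final class JoinSketch {

    // Hypothetical helper that builds one key-value pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleImmutableEntry<>(key, value);
    }

    // Inner join: keeps only keys present in both tables.
    static <K, V, U> List<Map.Entry<K, Map.Entry<V, U>>> join(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        // Index the right-hand table by key so each left pair can find its matches.
        Map<K, List<U>> rightByKey = new HashMap<>();
        for (Map.Entry<K, U> r : right) {
            rightByKey.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }
        List<Map.Entry<K, Map.Entry<V, U>>> out = new ArrayList<>();
        for (Map.Entry<K, V> l : left) {
            List<U> matches = rightByKey.getOrDefault(l.getKey(), Collections.<U>emptyList());
            for (U u : matches) {
                out.add(pair(l.getKey(), pair(l.getValue(), u)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, String>> names =
            Arrays.asList(pair(1, "alice"), pair(2, "bob"), pair(3, "carol"));
        List<Map.Entry<Integer, String>> cities =
            Arrays.asList(pair(1, "NYC"), pair(1, "SF"), pair(3, "LA"), pair(4, "DC"));
        // Keys 2 and 4 occur in only one table, so they are dropped.
        System.out.println(join(names, cities)); // prints [1=alice=NYC, 1=alice=SF, 3=carol=LA]
    }
}
```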


cogroup

<U> PTable<K,Pair<Collection<V>,Collection<U>>> cogroup(PTable<K,U> other)
Performs a co-group operation with the given table on common keys.
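
To make the return type PTable<K,Pair<Collection<V>,Collection<U>>> concrete, the sketch below co-groups two small in-memory tables. It is a model for illustration only, not the Crunch implementation: CogroupSketch, pair, and slot are hypothetical names. It also assumes that a key present in only one table still appears in the result, with an empty collection on the other side, as co-group operations commonly behave; the description above does not state this.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model only (assumption): a List<Map.Entry<K,V>> stands in for a PTable<K,V>.
final class CogroupSketch {

    // Hypothetical helper that builds one key-value pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleImmutableEntry<>(key, value);
    }

    // For each key, collects the left values and the right values side by side.
    static <K, V, U> Map<K, Map.Entry<Collection<V>, Collection<U>>> cogroup(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, U>> right) {
        Map<K, Map.Entry<Collection<V>, Collection<U>>> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> l : left) {
            CogroupSketch.<K, V, U>slot(out, l.getKey()).getKey().add(l.getValue());
        }
        for (Map.Entry<K, U> r : right) {
            CogroupSketch.<K, V, U>slot(out, r.getKey()).getValue().add(r.getValue());
        }
        return out;
    }

    // Returns the pair of collections for a key, creating empty ones on first use.
    private static <K, V, U> Map.Entry<Collection<V>, Collection<U>> slot(
            Map<K, Map.Entry<Collection<V>, Collection<U>>> out, K key) {
        return out.computeIfAbsent(key,
            k -> new SimpleImmutableEntry<Collection<V>, Collection<U>>(
                new ArrayList<V>(), new ArrayList<U>()));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
            Arrays.asList(pair("a", 1), pair("b", 2), pair("a", 3));
        List<Map.Entry<String, String>> right =
            Arrays.asList(pair("a", "x"), pair("c", "y"));
        System.out.println(cogroup(left, right)); // prints {a=[1, 3]=[x], b=[2]=[], c=[]=[y]}
    }
}
```

Unlike join, which emits one output pair per matching combination of values, co-grouping emits one entry per key and leaves the two value collections separate.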



Copyright © 2012. All Rights Reserved.