com.cloudera.crunch
Class FilterFn<T>

java.lang.Object
  extended by com.cloudera.crunch.DoFn<T,T>
      extended by com.cloudera.crunch.FilterFn<T>
All Implemented Interfaces:
Serializable
Direct Known Subclasses:
FilterFn.AndFn, FilterFn.NotFn, FilterFn.OrFn

public abstract class FilterFn<T>
extends DoFn<T,T>

A DoFn for the common case of filtering the members of a PCollection based on a boolean condition.

See Also:
Serialized Form

Nested Class Summary
static class FilterFn.AndFn<S>
           
static class FilterFn.NotFn<S>
           
static class FilterFn.OrFn<S>
           
 
Constructor Summary
FilterFn()
           
 
Method Summary
abstract  boolean accept(T input)
          Returns true if the given record should be emitted.
static
<S> FilterFn<S>
and(FilterFn<S>... fns)
          Returns a filter that accepts a record only if all of the given filters accept it.
static
<S> FilterFn<S>
not(FilterFn<S> fn)
          Returns a filter that accepts a record only if the given filter rejects it.
static
<S> FilterFn<S>
or(FilterFn<S>... fns)
          Returns a filter that accepts a record if at least one of the given filters accepts it.
 void process(T input, Emitter<T> emitter)
          Processes the records from a PCollection.
 float scaleFactor()
          Returns an estimate of how applying this function to a PCollection will cause it to change in size.
 
Methods inherited from class com.cloudera.crunch.DoFn
cleanup, configure, getConfiguration, getCounter, getCounter, getStatus, getTaskAttemptID, initialize, progress, setConfigurationForTest, setContext, setStatus
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FilterFn

public FilterFn()
Method Detail

accept

public abstract boolean accept(T input)
Returns true if the given record should be emitted. Records for which this method returns false are dropped from the output.
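
As a minimal sketch of how accept is typically used: a subclass supplies the boolean condition, and process forwards only the records for which accept returns true. So that the example runs without Crunch or Hadoop on the classpath, the Emitter and FilterFn types below are simplified stand-ins that mirror the signatures on this page (only emit is modeled), not the real com.cloudera.crunch classes. AcceptSketch, NonEmptyFn, and apply are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AcceptSketch {
  /** Stand-in for com.cloudera.crunch.Emitter; only emit() is modeled here. */
  interface Emitter<T> {
    void emit(T value);
  }

  /** Stand-in for FilterFn: process() forwards a record only when accept() is true. */
  abstract static class FilterFn<T> {
    public abstract boolean accept(T input);

    public void process(T input, Emitter<T> emitter) {
      if (accept(input)) {
        emitter.emit(input);
      }
    }
  }

  /** Hypothetical filter that keeps non-null, non-empty strings. */
  static class NonEmptyFn extends FilterFn<String> {
    @Override
    public boolean accept(String input) {
      return input != null && !input.isEmpty();
    }
  }

  /** Runs a filter over an in-memory list and collects whatever it emits. */
  static <T> List<T> apply(FilterFn<T> fn, List<T> records) {
    final List<T> out = new ArrayList<T>();
    Emitter<T> collector = new Emitter<T>() {
      @Override
      public void emit(T value) {
        out.add(value);
      }
    };
    for (T record : records) {
      fn.process(record, collector);
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> kept = apply(new NonEmptyFn(), Arrays.asList("a", "", "b"));
    System.out.println(kept); // prints [a, b]
  }
}
```

With the real library, the same accept body would go in a class extending com.cloudera.crunch.FilterFn<String>, and the framework would supply the Emitter.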


process

public void process(T input,
                    Emitter<T> emitter)
Description copied from class: DoFn
Processes the records from a PCollection.

Specified by:
process in class DoFn<T,T>
Parameters:
input - The input record
emitter - The emitter to send the output to

scaleFactor

public float scaleFactor()
Description copied from class: DoFn
Returns an estimate of how applying this function to a PCollection will cause it to change in size. The optimizer uses these estimates to decide where to break up dependent MR jobs into separate Map and Reduce phases in order to minimize I/O.

Subclasses of DoFn that will substantially alter the size of the resulting PCollection should override this method.

Overrides:
scaleFactor in class DoFn<T,T>
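
The advice above about overriding scaleFactor can be sketched as follows for a very selective filter. This is an illustration under stated assumptions: FilterFn here is a simplified stand-in rather than the real class, the 0.5f default and the 0.01f override are made-up values, and estimateOutputBytes is a hypothetical helper showing one plausible way a size estimate could be derived (input size times scale factor), not a description of the optimizer's actual logic.

```java
class ScaleFactorSketch {
  /** Stand-in for FilterFn; the 0.5f default is an assumed, illustrative value. */
  abstract static class FilterFn<T> {
    public abstract boolean accept(T input);

    public float scaleFactor() {
      return 0.5f;
    }
  }

  /** Hypothetical filter that is expected to keep roughly 1 record in 100. */
  static class ErrorLineFn extends FilterFn<String> {
    @Override
    public boolean accept(String line) {
      return line.startsWith("ERROR");
    }

    @Override
    public float scaleFactor() {
      return 0.01f; // assumption: error lines are rare in the input
    }
  }

  /** Illustrative estimate: input size multiplied by the function's scale factor. */
  static long estimateOutputBytes(long inputBytes, FilterFn<?> fn) {
    return (long) (inputBytes * fn.scaleFactor());
  }

  public static void main(String[] args) {
    // A much smaller estimate than the default would give for the same input.
    System.out.println(estimateOutputBytes(1_000_000L, new ErrorLineFn()));
  }
}
```

A filter that keeps nearly everything could instead return a value close to 1.0f, so the estimate does not understate the output size.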

and

public static <S> FilterFn<S> and(FilterFn<S>... fns)
Returns a filter that accepts a record only if all of the given filters accept it.

or

public static <S> FilterFn<S> or(FilterFn<S>... fns)
Returns a filter that accepts a record if at least one of the given filters accepts it.

not

public static <S> FilterFn<S> not(FilterFn<S> fn)
Returns a filter that accepts a record only if the given filter rejects it.
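
The and, or, and not factory methods make it possible to build compound conditions from simple filters. The sketch below shows the idea with a simplified stand-in FilterFn whose combinators are written out by hand. Their bodies assume the usual boolean semantics (conjunction, disjunction, negation) suggested by the method names and are not taken from the library source. StartsWithFn, ShorterThanFn, and example are hypothetical names.

```java
class CombinatorSketch {
  /** Stand-in for FilterFn with hand-written and/or/not (assumed semantics). */
  abstract static class FilterFn<T> {
    public abstract boolean accept(T input);

    /** Accepts only if every given filter accepts. */
    @SafeVarargs
    public static <S> FilterFn<S> and(final FilterFn<S>... fns) {
      return new FilterFn<S>() {
        @Override
        public boolean accept(S input) {
          for (FilterFn<S> fn : fns) {
            if (!fn.accept(input)) {
              return false;
            }
          }
          return true;
        }
      };
    }

    /** Accepts if at least one given filter accepts. */
    @SafeVarargs
    public static <S> FilterFn<S> or(final FilterFn<S>... fns) {
      return new FilterFn<S>() {
        @Override
        public boolean accept(S input) {
          for (FilterFn<S> fn : fns) {
            if (fn.accept(input)) {
              return true;
            }
          }
          return false;
        }
      };
    }

    /** Accepts exactly the records that the given filter rejects. */
    public static <S> FilterFn<S> not(final FilterFn<S> fn) {
      return new FilterFn<S>() {
        @Override
        public boolean accept(S input) {
          return !fn.accept(input);
        }
      };
    }
  }

  /** Hypothetical filter: keeps strings beginning with a fixed prefix. */
  static class StartsWithFn extends FilterFn<String> {
    private final String prefix;

    StartsWithFn(String prefix) {
      this.prefix = prefix;
    }

    @Override
    public boolean accept(String input) {
      return input.startsWith(prefix);
    }
  }

  /** Hypothetical filter: keeps strings shorter than a fixed length. */
  static class ShorterThanFn extends FilterFn<String> {
    private final int limit;

    ShorterThanFn(int limit) {
      this.limit = limit;
    }

    @Override
    public boolean accept(String input) {
      return input.length() < limit;
    }
  }

  /** Keeps strings that start with "a" and have at least 3 characters. */
  static FilterFn<String> example() {
    return FilterFn.and(new StartsWithFn("a"), FilterFn.not(new ShorterThanFn(3)));
  }

  public static void main(String[] args) {
    FilterFn<String> f = example();
    System.out.println(f.accept("apple"));  // true
    System.out.println(f.accept("ab"));     // false
    System.out.println(f.accept("banana")); // false
  }
}
```

Composing small filters this way keeps each accept method focused on a single condition, and the compound filter can be passed wherever a FilterFn<S> is expected.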


Copyright © 2012. All Rights Reserved.